
FUNCTIONAL TESTING OF A MICROPROCESSOR

THROUGH LINEAR CHECKING METHOD

Prof. Dr. Pervez Akhtar


National University of Science & Technology,
Karachi Campus, Pakistan
[email protected]

Prof. Dr. M.Altaf Mukati


Hamdard Institute of Information Technology,
Hamdard University, Karachi, Pakistan
[email protected]

ABSTRACT
Gate-level testing, also called low-level testing, is generally appropriate at design
time and for small circuits. Chip-level and board-level testing, also called
high-level testing, are preferred when the circuit complexities are too high, making
it difficult to perform low-level testing in a reasonable amount of time. The cost of
low-level testing is also generally very high; such high costs and times are only
justified when design changes are required. In this paper, a high-level quick
checking method, known as the Linear Checking Method, is presented which can be
used to qualify the functionality of a microprocessor. It can also be used to check
hard faults in memory chips.

Keywords: Microprocessors, ALU, Control Unit, Instructions.

1 INTRODUCTION

Due to the advances in integrated circuit technology, more and more components are being fabricated into a tiny chip. Since the number of pins on each chip is limited by the physical size of the chip, the problem of testing becomes more difficult than ever. This problem is aggravated by the fact that, in nearly all cases, integrated circuit manufacturers do not release the detailed circuit diagram of the chip to the users [1].

The users are generally more interested in knowing whether the chip is functionally working and can be relied upon; if not, the whole chip is replaced with a newer one. This is in contrast to the gate-level testing of a digital circuit, which is used to diagnose faulty gates in the given circuit in case of failure.

The idea of using functional testing is also supported by the fact that, in case of any functional failure caused by a fault in the chip, the user cannot repair the chip. Hence the users have only two choices: either to continue using the chip with a particular failing function, knowing that the failing function will not be used in the given application, or to replace the whole chip.

Functional modeling is done at a higher level of abstraction than gate-level modeling. It in fact lies between gate-level modeling and behavioral modeling, which is the highest level of abstraction [2]. Functional fault modeling should imitate the physical defects that cause a change in the function or behavior; for example, the function of a synchronous binary up-counter is to advance one stage higher in binary value on each clock pulse. A physical defect which alters this function can be modeled in terms of its effect on the function. Such defect-finding is extremely important at design time, or if design changes are required at a later stage.

What if a microprocessor does not produce the correct results for one or more functions? From the user's perspective, it is enough to know which function is failing, but from the designer's perspective, the cause of the failure is also important to know, so that design changes may be carried out if necessary. This is certainly a time-consuming process. For example, the gate-level simulation of the Intel 8085 microprocessor took 400 hours of CPU time and only provided 70% fault coverage [3].

High-level functional verification of complex Systems-on-Chip (SOCs) and microprocessors has become a key challenge. Functional verification and Automatic Test Pattern Generation (ATPG) is one synergetic area that has evolved significantly in recent years due to the blossoming of a wide array of test and verification techniques. This area will continue to be a key focus of future Microprocessor Test and Verification (MTV) [4].



Functional failures can be caused by single or multiple stuck-at faults in any of the functional blocks. Functional testing, which refers to the selection of tests that verify the functional operation of a device, is one of the efficient methods to deal with the faults existing in a processor. Functional testing can also be carried out at a smaller scale; for example, a functional test of a flip-flop might verify whether it can be set or reset and, further, whether it can hold its state. Similarly, other MSI chips such as multiplexers, encoders, decoders, counters, hardwired multipliers, binary adders and subtractors, comparators, parity checkers, registers and other similar circuits can also be verified for their required functionalities.

Some designers and manufacturers provide built-in self-test (BIST) these days, which generates the test on the chip, and the responses are checked within the chip itself. However, the widespread use of such testability techniques is hampered by a lack of tools to support the designer and by the additional cost in chip area as well as the degradation in performance [5]. For example, the Intel 80386 microprocessor employs about 1.8% area overhead for BIST to test portions of the circuit [6].

The ever increasing complexity, combined with the advanced technology used in the design of modern microprocessors, has led to two major problems in producing cost-effective, high quality chips:

1. Verification: This is related to validating the correctness of the complex design. Simulation is the primary means of design validation used today. In the case of processor design validation, the sequences are either written manually or generated automatically by a random sequence generator [7].

2. Testing: This is related to checking the manufactured chips for realistic defects. A variety of test generation and design-for-testability (DFT) techniques is used to ensure that the manufactured chips are defect-free.

Both design verification and testing depend, therefore, on test sequences used to expose either design faults or manufacturing defects. It has also been found that manufacturing test pattern generation can be used for design verification [8] and that design verification techniques can be used to find better manufacturing tests [9]. However, finding effective test patterns for either of these purposes is not simple, due to the high complexity of microprocessors. Hence the only effective method left is to develop functional tests. Considerable work has been done in the field of microprocessor functional testing. One such work, known as the 'Linear Checking Method', is presented in this paper.

Before performing functional testing, the functional description of the chip must be known. In the case of a microprocessor, this can be obtained through its instruction set. The two most important functional blocks of any microprocessor are the CU (Control Unit) and the ALU (Arithmetic Logic Unit). All the instructions, at low level, are composed of op-codes and operands. An op-code, also called the macro-instruction, goes to the CU, which decodes each macro-instruction into a unique set of micro-instructions. The operands go to the ALU, which processes them according to the tasks defined within the micro-instructions. In between these functional blocks, there exist several registers for the temporary storage of op-codes, decoded instructions and operands.

Faults may occur at various places in the processor, causing it to function incorrectly. Some of the common faults are: Register Decoding Fault, Micro-Operation Decoding Fault (which may be caused by an internal defect in the CU), Data Storage Fault (which may be caused by a stuck-at fault or a pattern sensitive fault in the memory inside the microprocessor), Data Transfer Fault (which may be caused by a stuck-at fault or a bridging fault on the busses connecting the various functional blocks of a microprocessor) or ALU Fault (caused by an internal defect in the ALU). In each case, the given microprocessor produces one or more incorrect functions.

In the subsequent sections, first the functional verification is described in general and then the Linear Checking Method is presented through several examples. Based on the results obtained, the conclusion is drawn and further work is proposed.

2 FUNCTIONAL VERIFICATION

The micro-instructions from the CU and the operands of an instruction are sent to the ALU simultaneously. The ALU then carries out the intended task or function. This can be shown with the help of a block diagram, as in Fig. 1.

Figure 1: Functional testing
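For illustration only, the op-code/operand split described above can be modeled as a small dispatch table mapping op-codes to operand-level functions. The following sketch is hypothetical (it does not correspond to any particular processor's instruction set; the names and the 4-bit width are our assumptions):

```python
# Hypothetical 4-bit ALU model: the op-code selects the function applied
# to the operands, mirroring the CU/ALU split described above.
MASK = 0xF

ALU = {
    "ADD":   lambda x, y: x + y,
    "AND":   lambda x, y: x & y,
    "XOR":   lambda x, y: x ^ y,
    "SHR":   lambda x, y: x >> 1,        # single-operand: y is ignored
    "COMPL": lambda x, y: ~x & MASK,     # bitwise complement within 4 bits
}

def execute(opcode, x, y=0):
    """Decode the op-code and apply the corresponding operation."""
    return ALU[opcode](x, y)

print(execute("ADD", 5, 3))      # 8
print(execute("COMPL", 0b1010))  # 5
```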



The typical instructions are ADD, SUB, MUL, SHL, SHR, ROTL, ROTR, INC, DEC, COMPL, AND, OR, XOR and many others.

3 LINEAR CHECKING METHOD

This method can be used to test and verify not only the functionality of a microprocessor (more specifically its ALU), but the memories as well. Linear checking is based on computing the value of 'K' using the basic equation:

K = f_i(x, y) + f_i(x, ȳ) + f_i(x̄, y) + f_i(x̄, ȳ)     (1)

Equation 1 is called the 'basic equation'. The variables x and y are the operands, 'i' is the instruction, and x̄ and ȳ denote the bitwise complements of x and y. The value of K does not depend on the values of x and y, but only on the instruction and on the size of the operands (number of bits in the operands). It means the value of K is unique for every instruction; the chances are very small that two instructions have the same constant value of K. An 8-bit and a 16-bit ALU have different values of K for the same instruction. Hence, in this method, K is used as a reference value to verify the functionality of an individual instruction.

3.1 Examples of functional verifications

Consider a 4-bit ALU. The value of 'K' can be computed as follows.

Suppose the instruction is ADD(x, y) = x + y.

Here, n = 4. Let x = 5 (0101) and y = 3 (0011). Therefore x̄ = 1010 and ȳ = 1100.

The value of K can be obtained from Equation 1, as follows:

ADD(5,3) + ADD(5,12) + ADD(10,3) + ADD(10,12) = K
8 + 17 + 13 + 22 = 60

Hence, for a 4-bit ALU, the ADD instruction will always be tested with respect to its reference value of 60, regardless of what values of x and y are taken; i.e. instead of 5 and 3 as in the above example, these values are now taken as 9 and 10 respectively. Still the value of K remains the same, as proved below.

For x = 9 (1001) and y = 10 (1010), x̄ = 6 (0110) and ȳ = 5 (0101):

ADD(9,10) + ADD(9,5) + ADD(10,6) + ADD(6,5) = 60

The generalized formula can also be developed to find the value of K for the ADD instruction, for any size of ALU, as follows:

K+(n) = 4(2^n - 1)

where the subscript of K represents the function. Hence from the generalized form we obtain the same value of K, i.e. if n = 4, then K+(4) = 4(15) = 60.

Similarly, the value of K for any instruction can be obtained, provided its functional description is known. The value of K for various other frequently used instructions can be obtained in the same way, as follows. Again assume a 4-bit ALU, taking x = 10 and y = 12, so that x̄ = 5 and ȳ = 3 in all the computations.

3.1.1 Multiply instruction (f(x,y) = x * y)

f_i(x,y) = MPY(x,y) = x * y

Hence, from Equation 1, the value of K can be obtained as follows:

MPY(10,12) + MPY(10,3) + MPY(5,12) + MPY(5,3)
= 120 + 30 + 60 + 15 = 225

Generalized form: K*(n) = (2^n - 1)^2

3.1.2 Transfer instruction (f(x) = x)

This is a single valued function, hence only one variable is taken in the computation of K, and 'y' is ignored in Equation 1. Thus f_i(x) = x, and

K = x + x + x̄ + x̄ = 10 + 10 + 5 + 5 = 30

Generalized form: K = 2(2^n - 1)

3.1.3 Shift-Right instruction (f(x) = SHR(x))

It is also a single valued function. With x = 1010:

(a) f_i(x, y) and f_i(x, ȳ) reduce to f_i(x), and
(b) f_i(x̄, y) and f_i(x̄, ȳ) reduce to f_i(x̄).

Now f_i(x) represents the value of x after the SHR operation, i.e. 1010 becomes 0101, and f_i(x̄) represents the value of x̄ after the SHR operation, i.e. 0101 becomes 0010.

Hence, K = 0101 + 0101 + 0010 + 0010
or K = 5 + 5 + 2 + 2 = 14

Generalized form: K = 2(2^(n-1) - 1)

3.1.4 Shift-Left instruction (f(x) = SHL(x))

With the same explanation as in section 3.1.3, Equation 1 becomes:

K = f_i(x) + f_i(x) + f_i(x̄) + f_i(x̄)



Hence, with x = 1010, f_i(x) = 0100, and with x̄ = 0101, f_i(x̄) = 1010.

Therefore, K = 4 + 4 + 10 + 10 = 28

Generalized form: K = 2(2^n - 2)

3.1.5 Logical-OR instruction (f(x,y) = x OR y)

K = f_i(x, y) + f_i(x, ȳ) + f_i(x̄, y) + f_i(x̄, ȳ)
  = (x OR y) + (x OR ȳ) + (x̄ OR y) + (x̄ OR ȳ)
  = 1110 + 1011 + 1101 + 0111
  = 14 + 11 + 13 + 7 = 45

Generalized form: K = 3(2^n - 1)

3.1.6 Logical-AND instruction (f(x,y) = x AND y)

K = f_i(x, y) + f_i(x, ȳ) + f_i(x̄, y) + f_i(x̄, ȳ)
  = (x AND y) + (x AND ȳ) + (x̄ AND y) + (x̄ AND ȳ)
  = 1000 + 0010 + 0100 + 0001
  = 8 + 2 + 4 + 1 = 15

Generalized form: K = 2^n - 1

3.1.7 Logical-XOR instruction (f(x,y) = x XOR y)

K = f_i(x, y) + f_i(x, ȳ) + f_i(x̄, y) + f_i(x̄, ȳ)
  = (x XOR y) + (x XOR ȳ) + (x̄ XOR y) + (x̄ XOR ȳ)
  = 0110 + 1001 + 1001 + 0110
  = 6 + 9 + 9 + 6 = 30

Generalized form: K = 2(2^n - 1)

3.1.8 Increment instruction (f(x) = INC(x))

This is also a single valued function:

K = f_i(x) + f_i(x) + f_i(x̄) + f_i(x̄)
  = 1011 + 1011 + 0110 + 0110
  = 11 + 11 + 6 + 6 = 34

Generalized form: K = 2(2^n + 1)

3.1.9 Decrement instruction (f(x) = DEC(x))

K = f_i(x) + f_i(x) + f_i(x̄) + f_i(x̄)
  = 1001 + 1001 + 0100 + 0100
  = 9 + 9 + 4 + 4 = 26

Generalized form: K = 2(2^n - 3)

3.1.10 Complement instruction (f(x) = x̄)

K = f_i(x) + f_i(x) + f_i(x̄) + f_i(x̄)
  = 0101 + 0101 + 1010 + 1010 = 30

Generalized form: K = 2(2^n - 1)

3.1.11 2's complement instruction (f(x) = x̄ + 1)

K = f_i(x) + f_i(x) + f_i(x̄) + f_i(x̄)
  = 0110 + 0110 + 1011 + 1011 = 34

Generalized form: K = 2(2^n + 1)

3.2 Memory error correction

Linear checks can also be used to verify memories. For example, let the multiplication function x*y be stored in the memory. Let the operands be 4 bits long, with x = 1010 and y = 0010; it means x̄ = 0101 and ȳ = 1101. Hence, all four components of Equation 1 are computed and the results are stored in memory, as shown in Table 1.

Table 1: linear checks on memories

f(x, y)      20    00010100
f(x, ȳ)     130    10000010
f(x̄, y)      10    00001010
f(x̄, ȳ)      65    01000001
K           225    11100001

If the sum of the four components is not equal to the value of K, then there must be some fault in the memory. Similarly, any of the preceding functions can be used to verify the memories. The testing can be done more accurately if the contents of f(x,y) are stored at address (x,y). In the above example, the contents can be stored at the corresponding addresses as shown in Table 2. If the addition of the contents does not equal the value of K, it indicates some fault in the memory; here, the location of the fault is also obtained.

Table 2: address versus contents in memory testing

Address (x, y)    Contents
1010 0010         00010100
1010 1101         10000010
0101 0010         00001010
0101 1101         01000001
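As a concrete illustration of the procedure above, the following minimal Python sketch computes K from the basic equation and reproduces the memory check of Section 3.2. It assumes a 4-bit ALU and plain integer arithmetic; the helper names (comp, linear_check) are ours and are not part of the method's definition:

```python
# Sketch of the linear checking method for a 4-bit ALU (n = 4).
N = 4
MASK = (1 << N) - 1           # 0b1111

def comp(v):
    """Bitwise complement of an n-bit operand (x-bar in the text)."""
    return ~v & MASK

def linear_check(f, x, y):
    """Basic equation (1): K = f(x,y) + f(x,~y) + f(~x,y) + f(~x,~y)."""
    return f(x, y) + f(x, comp(y)) + f(comp(x), y) + f(comp(x), comp(y))

add = lambda x, y: x + y
mpy = lambda x, y: x * y

# K does not depend on the operands, e.g. ADD gives 4(2^n - 1) = 60
assert linear_check(add, 5, 3) == linear_check(add, 9, 10) == 60
# and MPY gives (2^n - 1)^2 = 225
assert linear_check(mpy, 10, 12) == linear_check(mpy, 1, 7) == 225

# Memory check (Section 3.2): store f(x,y) at address (x,y) for the four
# operand combinations, then compare the sum against the reference K.
x, y = 0b1010, 0b0010
memory = {(a, b): mpy(a, b)
          for a in (x, comp(x)) for b in (y, comp(y))}
K_ref = (2**N - 1) ** 2        # 225 for MPY
if sum(memory.values()) != K_ref:
    print("fault detected in the stored function")
```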



4 RESULTS

All the computations done in the previous section are summarized in Tables 3 and 4 for n = 4 and n = 8 respectively.

Table 3: Values tabulated through the linear checking method for n = 4

Instruction i    f_i(x, y)                   K_i(n)            K_i(4)
Clear            0                           0                 0
Transfer         x                           2(2^n - 1)        30
Add              x + y                       4(2^n - 1)        60
Multiply         x * y                       (2^n - 1)^2       225
Subtract         x - y                       0                 0
Logical OR       x ∨ y                       3(2^n - 1)        45
Logical AND      x ∧ y                       2^n - 1           15
Logical XOR      x ⊕ y                       2(2^n - 1)        30
Complement       x̄                           2(2^n - 1)        30
2's comp.        x̄ + 1                       2(2^n + 1)        34
Increment        x + 1                       2(2^n + 1)        34
Decrement        x - 1                       2(2^n - 3)        26
Shift-Left       (x2, x3, ..., xn, 0)        2(2^n - 2)        28
Shift-Right      (0, x1, x2, ..., xn-1)      2(2^(n-1) - 1)    14
Rotate-Left      (x2, x3, ..., xn, x1)       2(2^n - 1)        30
Rotate-Right     (xn, x1, ..., xn-1)         2(2^n - 1)        30

Table 4: Values tabulated through the linear checking method for n = 8

Instruction i    f_i(x, y)                   K_i(n)            K_i(8)
Clear            0                           0                 0
Transfer         x                           2(2^n - 1)        510
Add              x + y                       4(2^n - 1)        1020
Multiply         x * y                       (2^n - 1)^2       65025
Subtract         x - y                       0                 0
Logical OR       x ∨ y                       3(2^n - 1)        765
Logical AND      x ∧ y                       2^n - 1           255
Logical XOR      x ⊕ y                       2(2^n - 1)        510
Complement       x̄                           2(2^n - 1)        510
2's comp.        x̄ + 1                       2(2^n + 1)        514
Increment        x + 1                       2(2^n + 1)        514
Decrement        x - 1                       2(2^n - 3)        506
Shift-Left       (x2, x3, ..., xn, 0)        2(2^n - 2)        508
Shift-Right      (0, x1, x2, ..., xn-1)      2(2^(n-1) - 1)    254
Rotate-Left      (x2, x3, ..., xn, x1)       2(2^n - 1)        510
Rotate-Right     (xn, x1, ..., xn-1)         2(2^n - 1)        510

5 CONCLUSION

It is concluded that the value of K can be obtained for any given instruction. The 'CLR' (clear) instruction is a special one, since it does not have any operand; all four components of Equation 1 are taken as 0. Note that almost all the values obtained for K are unique, except for the 'Transfer', 'Complement', 'Logical XOR' and 'Rotate-Left/Right' instructions. This means that if these instructions are transformed into one another due to a fault in the CU (or in any associated circuit), the particular functional failures cannot be distinguished, but the processor as a whole can still be declared to contain a fault. The last column in Tables 3 and 4 can be obtained directly from the generalized forms. This column is stored in the memory along with the relevant function.

6 FUTURE WORK

Further research is proposed on the given method, especially for the case when the reference value K of two or more functions comes out the same, i.e. to distinguish or identify an individual failing function when their reference values happen to be equal, as mentioned in the conclusion.
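As noted in the conclusion, the last column of Tables 3 and 4 can be regenerated directly from the generalized forms. A short sketch follows; the dictionary of formulas is simply transcribed from the tables and the code is illustrative only:

```python
# Reproduce the K_i(4) and K_i(8) columns of Tables 3 and 4.
FORMS = {
    "Clear":        lambda n: 0,
    "Transfer":     lambda n: 2 * (2**n - 1),
    "Add":          lambda n: 4 * (2**n - 1),
    "Multiply":     lambda n: (2**n - 1) ** 2,
    "Subtract":     lambda n: 0,
    "Logical OR":   lambda n: 3 * (2**n - 1),
    "Logical AND":  lambda n: 2**n - 1,
    "Logical XOR":  lambda n: 2 * (2**n - 1),
    "Complement":   lambda n: 2 * (2**n - 1),
    "2's comp.":    lambda n: 2 * (2**n + 1),
    "Increment":    lambda n: 2 * (2**n + 1),
    "Decrement":    lambda n: 2 * (2**n - 3),
    "Shift-Left":   lambda n: 2 * (2**n - 2),
    "Shift-Right":  lambda n: 2 * (2**(n - 1) - 1),
    "Rotate-Left":  lambda n: 2 * (2**n - 1),
    "Rotate-Right": lambda n: 2 * (2**n - 1),
}

for n in (4, 8):
    print(f"K_i({n}):")
    for name, k in FORMS.items():
        print(f"  {name:12s} {k(n)}")   # 60, 225, ... for n=4; 1020, 65025, ... for n=8
```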



7 REFERENCES

[1] Su Stephen Y.H., Lin Tony S., Shen Li: Functional Testing of LSI/VLSI Digital Systems, Defense Technical Information Center, Final Technical Report, School of Engineering, Applied Science and Technology, Binghampton, NY, August 1984.

[2] Altaf Mukati: Fault Diagnosis and Testing of Digital Circuits with an Introduction to Error Control Coding, Higher Education Commission Pakistan, ISBN: 969-417-095-8, 2006.

[3] Jian Shen, Abraham J.A.: Native mode functional test generation for processors with applications to self test and design validation, Proc. International Test Conference, 18-23 Oct 1998, pp. 990-999.

[4] Magdy S. Abadir, Li-C. Wang, Jayanta Bhadra: Microprocessor Test and Verification (MTV 2006), Common Challenges and Solutions, Seventh International Workshop, Austin, Texas, USA, IEEE Computer Society, 4-5 December 2006, ISBN: 978-0-7695-2839-7.

[5] Jian Shen, Jacob A. Abraham: Native Mode Functional Test Generation for Processors with Applications to Self Test and Design Validation, Computer Engineering Research Center, The University of Texas at Austin, 1998.

[6] P. P. Gelsinger: Design and Test of the 80386, IEEE Design and Test of Computers, Vol. 4, pp. 42-50, June 1987.

[7] C. Montemayor et al.: Multiprocessor Design Verification for the PowerPC 620 Microprocessor, Proc. Intl. Conf. on Computer Design, pp. 188-195, 1995.

[8] M. S. Abadir, J. Ferguson, and T. Kirkland: Logic design verification via test generation, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 7, pp. 138-148, 1988.

[9] D. Moundanos, J. A. Abraham, and Y. V. Hoskote: A Unified Framework for Design Validation and Manufacturing Test, Proc. Intl. Test Conf., pp. 875-884, 1996.



OPERATING SYSTEMS FOR
WIRELESS SENSOR NETWORKS: AN OVERVIEW

Daniele De Caneva, Pier Luca Montessoro and Davide Pierattoni


DIEGM – University of Udine, Italy
{daniele.decaneva; montessoro; pierattoni}@uniud.it

ABSTRACT

The technological trend of recent years has led to the emergence of complete
systems on a single chip with integrated low-power communication and transducer
capabilities. This has opened the way for wireless sensor networks: a paradigm of
hundreds or even thousands of tiny, smart sensors with transducer and
communication capabilities. Managing such a complex network, which has to work
unattended for months or years while being aware of the limited power resources of
battery-supplied nodes, is a challenging task. Addressing that task requires an
adequate software platform, in other words an operating system specifically suited
for wireless sensor networks. This paper presents a brief overview of the best-known
such operating systems, highlighting the key challenges that have driven their
design.

Keywords: wireless sensor networks, operating systems.

1 INTRODUCTION

Thanks to the well-known "Moore's Law", integrated circuits are becoming smaller, cheaper and less power consuming. This trend has led to the emergence of complete systems on a chip with integrated low-power communication and transducer capabilities. The consequence is the opening of the ubiquitous computing era, in which electronic systems will be all around us, providing all kinds of information services to users in a distributed, omnipresent but nearly invisible fashion. One of the most important applications that new technologies are enabling is the paradigm of Wireless Sensor Networks (WSNs), where hundreds or even thousands of tiny sensors with communication capabilities will organize themselves to collect important environmental data or monitor areas for security purposes.

The hardware for WSNs is ready and many applications have become a reality; nevertheless, the lack of a commonly accepted system architecture and methodology constitutes a curb on the expansion and improvement of such technologies. Aware of that, many research groups around the world have proposed their own system architectures. The key point in all these proposals is the capability of the software to manage a considerable number of sensors. In particular, there is a tradeoff between the responsiveness of the system and the extremely scarce resources of the nodes in terms of power supply, memory and computational capabilities.

This article presents an overview of the best-known operating systems designed for WSNs. Without proposing direct comparisons, we describe the key features of these architectures and the challenges that led to their development, with the aim of helping the reader to choose among these systems the one that best suits his or her purposes.

2 OVERVIEW

2.1 TinyOS

TinyOS [1] is virtually the state of the art of sensor operating systems. Berkeley University researchers based their work on facing two issues: the first was to manage the concurrency-intensive nature of nodes, which need to keep different flows of data moving simultaneously, while the second was to develop a system with efficient modularity, believing that hardware and software components must snap together with little processing and storage overhead. The purpose of the researchers was also to develop a system that would easily scale with current technology trends, supporting smaller devices as well as the crossover of software components into hardware.

Considering power as the most precious resource and trying to achieve high levels of concurrency, the system was designed following an event-based approach, which avoids reserving stack space for each execution context. This design guideline was drawn from a parallelism with high performance computing, where event-based programming is the key to achieving high performance in concurrency-intensive applications.
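As a rough, conceptual illustration of such an event-based, run-to-completion design (this is not TinyOS or nesC code; the class and method names are hypothetical), the scheduler described in the following paragraphs can be reduced to a bounded FIFO of short tasks that is drained before the processor sleeps:

```python
from collections import deque

class EventScheduler:
    """Conceptual sketch of a run-to-completion task queue, with a
    bounded scheduling structure (hypothetical names, not TinyOS APIs)."""

    def __init__(self, capacity=8):
        self.tasks = deque(maxlen=capacity)   # bounded-size data structure

    def post(self, task):
        """Event handlers (e.g. interrupt code) post short deferred tasks."""
        self.tasks.append(task)

    def run(self):
        # Drain the queue: each task is atomic and runs to completion.
        # On a real node the CPU would then sleep until a hardware event
        # posts new tasks; here we simply return when the queue is empty.
        while self.tasks:
            self.tasks.popleft()()

sched = EventScheduler()
sched.post(lambda: print("sample sensor"))
sched.post(lambda: print("send packet"))
sched.run()
```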



In TinyOS neither blocking nor polling operations are permitted and the CPU does not waste time actively looking for interesting events; on the contrary, unused CPU cycles are spent in a sleep state.

The system configuration can be summarized as a tiny scheduler and a set of components. The scheduler is a simple FIFO utilizing a bounded-size scheduling data structure for efficiency; nonetheless, a more sophisticated scheduling policy could be implemented. When the task queue is empty, the CPU is forced into the sleep state, waiting for a hardware event to trigger the scheduling of the event-associated tasks. Tasks in the TinyOS architecture are atomic and run to completion, although they can be preempted by events. This semantics of tasks allows the allocation of a single stack, which is an important feature in memory constrained systems.

Three types of components constitute the TinyOS architecture. The first type of components is the "hardware abstraction", which maps physical hardware like I/O devices into component models. The second type of components is called "synthetic hardware" and simulates the behavior of advanced hardware; often synthetic hardware sits on top of the hardware abstraction components. The last type of components is the "high level software" components, which perform control, routing and all data transformations such as data aggregation and manipulation. This kind of abstraction of hardware and software in the component model is intended to ease the exploitation of tradeoffs between the scale of integration, the power requirements and the cost of the system. Every component owns a fixed-size frame that is statically allocated: this allows the memory requirements of a component to be known exactly at compile time and prevents the overhead associated with dynamic allocation.

TinyOS was originally developed in C, giving the system the capability of targeting multiple CPU architectures. However, the system was afterwards re-implemented in nesC: a programming language specific to networked embedded systems, whose key focus is a holistic approach to design.

It is remarkable that a byte-code interpreter has been developed for TinyOS that makes the system accessible to non-expert programmers and enables quick and efficient programming of a whole WSN. This interpreter, called Maté, depicts the program's code as made of capsules. Thanks to the beaconless, ad-hoc routing protocol implemented in Maté, when a sensor node receives a newer version of a capsule, it will install it. Through hop-by-hop code injection, Maté can update the code of the entire network.

2.2 MANTIS

The MultimodAl system for NeTworks of In-situ wireless Sensors [3] was developed focusing on two design keys: the need for a small learning curve for users and the need for flexibility. The first objective led to fundamental choices in the architecture of the system and the programming language used for its implementation. In fact, to lower the entry barrier the researchers decided to adopt a widely diffused design methodology, that is, the classical structure of a multithreaded operating system. For this reason MANTIS includes features like multithreading, preemptive scheduling with time slices, I/O synchronization via mutual exclusion, and a standard network stack and device drivers. The second choice is associated with the purpose of flattening the learning curve for users and determines the use of standard C as the development language for the kernel and the API. The choice of the C language additionally entails cross-platform support and the reuse of a vast legacy code base.

The MANTIS kernel resembles UNIX-style schedulers, providing services for a subset of POSIX threads along with priority-based scheduling. The thread table is allocated statically, so it can be adjusted only at compile time. The scheduler receives notes to trigger context switches from a hardware timer interrupt. This interrupt is the only kind of hardware interrupt handled by the kernel: in fact all the other interrupts are sent directly to device drivers. Context switches are triggered not only by timer events but also by system calls or semaphore operations. Besides drivers and user threads, MANTIS has a special idle, low-priority thread created by the kernel at startup. This thread can be used to implement power-aware scheduling: thanks to its position, it can detect patterns of CPU utilization and adjust kernel parameters to conserve energy.

MANTIS researchers considered wireless networking management a critical matter, so they developed the layered network stack as a set of user-level threads. In other words, they implemented different layers in different threads, advocating that this choice promotes flexibility to the detriment of performance. This flexible structure is useful in particular for dynamic reprogramming, because it enables application developers to reprogram network functionalities such as routing by simply starting, stopping and deleting user-level threads.

Drawing from their experience in WSNs, the developers of MANTIS gave their system a set of sophisticated features like dynamic reprogramming of sensor nodes via wireless communication, remote debugging and multimodal prototyping.


The MANTIS prototyping environment provides a framework for testing devices and applications across heterogeneous platforms. It extends beyond simulation, permitting the coexistence of both virtual and physical nodes in the same network. This feature derives directly from the system architecture, which can run without modifications on virtual nodes within an x86 architecture.

Dynamic reprogramming in MANTIS is implemented as a system call library which is built into the kernel. There are different granularities of reprogramming: entire kernel reflashing, reprogramming of a single thread, and changing of variables within a thread. Along with dynamic reprogramming, another important feature has been developed: the Remote Shell and Command Server, which allows the user to "log in" to a node and take control of it. The server is implemented as an application thread and gives the user the ability to alter the node's configuration, run or kill programs, and inspect and modify the inner state of the node.

2.3 Contiki

Contiki is an operating system based on a lightweight event-driven kernel. It was developed drawing from previous operating systems' work, with the goal of adding features like run-time loading and linking of libraries, programs and device drivers, as well as support for preemptive multithreading.

Event-based systems have shown good performance for many kinds of WSN applications; however, purely event-based systems have the penalty of being unable to respond to external events during long-lasting computations. A partial solution to this problem is adding multithreading support to the system, but this would cause additional overhead. To address these problems, Contiki researchers made the compromise of developing an event-driven kernel and implementing preemptive multithreading features as a library which is optionally linked with programs that explicitly require it.

The Contiki operating system can be divided into three main components: an event-driven kernel that provides basic CPU multiplexing and has no platform-specific code, a program loader, and finally a set of libraries that provide higher-level functionalities.

From a structural point of view, a system running Contiki can be partitioned into two parts: a core and a set of loadable programs. The core is compiled into a single binary image and is unmodifiable after the nodes' deployment. The programs are loaded into the system by the program loader, which may obtain the binaries either from the communication stack (and thus from the network) or from the system's EEPROM memory.

Shared libraries, like user programs, may be replaced in deployed systems by using dynamic linking. Dynamic linking is based on synchronous events: a library function is invoked by issuing an event generated by the caller program. The event broadcasts the request to all the libraries and a rendezvous protocol is used to find the library that implements the required function. When the correct library has completed the call, control returns to the calling process. Since dynamic linking bases its functioning on synchronous events, it is essential that the context switching overhead is as small as possible, in order to have good system performance. Contiki developers have ensured this by implementing processes as event handlers, which run without separate protection domains.

The flexible mechanism of dynamic linking allowed Contiki researchers to implement multithreading as a library optionally linked with programs. Another important component based on a shared library is the communication stack. Implementing the communication stack as a library allows its dynamic replacement and, more precisely, if the stack is split into different libraries it becomes easy to replace a communication layer on the run.

2.4 PicOS

PicOS is an operating system written in C and specifically aimed at microcontrollers with limited RAM on chip. In an attempt to ease the implementation of applications on resource-constrained hardware platforms, the PicOS creators leaned towards a programming environment which is a collection of functions for organizing the multiple activities of "reactive" applications. This environment is capable of providing services like a flavor of multitasking and tools for inter-process communication.

Each process is conceived as a finite state machine (FSM) that changes its state according to events. This approach is very effective for reactive applications, whose primary role is to respond to events rather than processing data or crunching numbers. CPU multiplexing happens only at state boundaries: in other words, FSM states can be viewed as checkpoints at which PicOS processes can be preempted. Owing to the fact that processes are preemptible only at clearly defined points, potentially problematic operations on counters and flags are always atomic. On the other hand, this non-preemptible character of PicOS processes makes the system not well suited for real time applications. In PicOS, active processes that need to wait for some event may release the CPU by issuing a "wait request", which defines the conditions necessary for their resumption. This way the CPU resources can be devoted to other processes.


The PicOS system is also equipped with several advanced features, like a memory allocator capable of organizing the heap area into a number of different disjoint pools, and a set of configurable device drivers including serial ports, LCD displays and Ethernet interfaces.

2.5 MagnetOS

Applications often need to adapt not only to external changes but also to internal changes initiated by the applications themselves. An example may come from a battlefront application that modifies its behavior when switching from the defensive to the offensive mode: this application could change its communication pattern and reorganize the deployed components. Focusing on this point, researchers at Cornell University argued that network-wide energy management is best provided by a distributed, power-aware operating system. Those researchers developed MagnetOS aiming at the following four goals. The first was adaptability to resource and network changes. The second was to follow efficient policies in terms of power consumption. The third goal was giving the OS general-purpose characteristics, allowing it to execute applications over networks of nodes with heterogeneous capabilities and handling different hardware and software choices. The fourth goal was providing the system with facilities for deploying, managing and modifying executing applications.

The result was a system providing an SSI, namely a Single System Image. In this abstraction, the entire network is presented to applications as a single unified Java virtual machine. The system, which follows the Distributed Virtual Machine paradigm, may be partitioned into a static and a dynamic component. The static component rewrites regular Java applications into objects that can be distributed across the network. The dynamic component provides on each node services for application monitoring and for object creation, invocation and migration. In order to achieve good performance, an auxiliary interface is provided by the MagnetOS runtime that overrides the automatic object placement decisions and allows programmers to explicitly direct object placement.

MagnetOS uses two online power-aware algorithms to reduce application energy consumption and to increase system survival by moving application components within the entire network. In practice, these protocols try to move the communication endpoints in order to conserve energy. The first of them, called NetPull, works at the physical layer, whereas the second one, called NetCenter, works at the network layer.

2.6 EYES

This operating system was developed within the EYES European project and tries to address the problems of scarce resources, in terms of both memory and power supply, and the need for distribution and reconfiguration capabilities.

The researchers found a solution to these problems by developing an event-driven system. In fact, the EYES OS is structured in modules that are executed as responses to external events, leaving the system in a power saving mode when there is no external event to serve. Every module can ask for several tasks to be performed; each task in turn defines a certain block of code that runs to completion. In this paradigm, no blocking operation is permitted and no polling operation should be instantiated: the programmer instead uses interrupts to wake up the system when the needed input becomes available.

The system provides a scheduler which can be implemented as a simple FIFO or as a more sophisticated algorithm. Interrupts are also seen as tasks scheduled and ready to be executed.

In the EYES architecture there are two system layers of abstraction. The first layer is the "Sensor and Networking Layer", which provides an API for the sensor nodes and the network protocols. The second layer is the "Distributed Services Layer", which exposes an API for mobile sensor application support. In particular, two services belong to this layer: the "Lookup Service" and the "Information Service". The first supports mobility, instantiation and reconfiguration, while the latter deals with aspects of collecting data. On top of the cited layers stand the user applications.

The EYES OS provides a four-step procedure for code distribution, designed to update the code in the nodes, including the operating system. This procedure is resilient to packet losses during the update, uses as few communication and local resources as possible, and halts the node operations only for a short period.

3 CONCLUSIONS

The operating systems described here present different approaches to the common problems of WSNs. It is not the aim of this article to express opinions about the presented systems; nevertheless, some general guidelines can be drawn from the work experience of all the esteemed researchers.

We now present some guidelines for the development of the next generation of WSN operating systems, which should help both researchers and users.



Table 1: summary of WSN OS features.

TinyOS
  Objectives: Manage concurrent data flows; scale easily with technology; modularity.
  Structure: Event-based approach; tiny scheduler and a set of components; no blocking or polling; developed in nesC.
  Special features: A byte-code interpreter for non-expert programmers.

Mantis
  Objectives: Small learning curve.
  Structure: Multithreaded OS with a UNIX-style scheduler; statically-allocated thread table; developed in C.
  Special features: Specific idle task that adjusts kernel parameters to conserve energy; remote debugging and reprogramming.

Contiki
  Objectives: Preemptive multithreading support; runtime loading and linking of libraries.
  Structure: Lightweight event-driven kernel; multithreading features as an optionally linked library.
  Special features: Capable of changing a communication layer on the run.

PicOS
  Objectives: Aimed at microcontrollers with tiny RAM.
  Structure: Each process thought of as a FSM; multiplexing at state boundaries; written in C.
  Special features: Memory allocator; a set of configurable device drivers.

MagnetOS
  Objectives: Adaptability to resource and network changes; manage nodes with heterogeneous capabilities.
  Structure: Single System Image: the entire network is a unified Java virtual machine.
  Special features: Two on-line algorithms to reduce energy consumption.

EYES
  Objectives: Address problems of scarce memory and power supply.
  Structure: Event-driven OS; structured in modules executed as responses to external events; each task runs to completion.
  Special features: Two layers of abstraction with specific APIs for applications and physical support; four-step procedure to update the code.

Table 2: the seven expected features of the next generation WSN operating systems.

  1. Power-aware policies
  2. Self organization
  3. Easy interface to expose data
  4. Simple way to program, update and debug network applications
  5. Power-aware communication protocols
  6. Portability
  7. Easy programming language for non-tech users



- The constrained nature of resources in embedded systems is evident, so small, efficient code is a primary goal, and power-aware policies are an obligatory condition for exploiting efficiency in WSN applications.

- To ensure the proper functioning of a network constituted by unattended nodes, possibly deployed in a harsh environment, the operating system must provide a mechanism for self-organization and reorganization in case of node failures.

- A WSN, especially if composed of a huge number of nodes, must behave as a distributed system, exposing an interface where data and processes are accessible and manageable as happens with databases.

- A large number of nodes also carries the need for an easy, yet power-efficient way to program the network, which should also be usable after deployment and without affecting normal functioning. Such a programming (and re-programming) procedure must be robust to interference and to all other causes of transmission failures during the dissemination of code chunks. While the entire re-programming of the core of the system may not be necessary, the applications must be patched, updated or even totally changed if the main purpose of the WSN changes. This leads to the preference, if possible, for different levels of re-programming granularity.

- The operating system must treat wireless communication interfaces as special resources, providing a set of different power-aware communication protocols. The system has to choose the proper protocol according to the current environment state and application needs.

- The operating system should be portable to different platforms: this is necessary both for the possible presence of nodes with different tasks and for the opportunity of post-deployment of newer sensors, which could be placed in order to reintegrate the network node set.

- The operating system should provide a platform for fast prototyping, testing and debugging of application programs. In this context it is remarkable to note that, if the WSN paradigm spreads into a kaleidoscopic set of applications, touching many aspects of our life, then program developers will not be just communication and computer engineers. It appears clear that, in order to support non-technical developers, a really simple API or even an application-typology programming language must be provided, alongside the "normal" and more efficient API. Making WSNs easy to use will make them more attractive and step up their diffusion.

4 REFERENCES

[1] J. Hill, R. Szewczyk, A. Woo, S. Hollar, D. Culler, K. Pister: System Architecture Directions for Networked Sensors, ASPLOS (2000).

[2] K. Sohraby, D. Minoli, T. Znati: Wireless Sensor Networks: Technology, Protocols and Applications, John Wiley & Sons Inc. (2007).

[3] H. Abrach, S. Bhatti, J. Carlson, H. Dai, J. Rose, A. Sheth, B. Sheth, B. Shucker, J. Deng, R. Han: MANTIS: System Support for MultimodAl NeTworks of In-situ Sensors, Proceedings of the 2nd ACM International Conference on Wireless Sensor Networks and Applications (2003).

[4] A. Dunkels, B. Grönvall, T. Voigt, J. Alonso: The Design for a Lightweight Portable Operating System for Tiny Networked Sensor Devices, SICS Technical Report (2004).

[5] E. Akhmetshina, P. Gburzynski, F. Vizecoumar: PicOS: A Tiny Operating System for Extremely Small Embedded Platforms, Proceedings of the Conference on Embedded Systems and Applications ESA'02 (2002).

[6] R. Barr, J. Bicket, D. S. Dantas, B. Du, T.W.D. Kim, B. Zhou, E. Sirer: On the need for system-level support for ad hoc and sensor networks, SIGOPS Oper. Syst. Rev. (2002).

[7] S. Dulman, P. Havinga: Operating System Fundamentals for the EYES Distributed Sensor Network, Proceedings of Progress'02 (2002).



Performance Evaluation of Deadline Monotonic Policy
over 802.11 protocol

Ines El Korbi and Leila Azouz Saidane


National School of Computer Science
University of Manouba, 2010 Tunisia
Emails: [email protected] [email protected]

ABSTRACT
Real time applications are characterized by their delay bounds. To satisfy the
Quality of Service (QoS) requirements of such flows over wireless
communications, we enhance the 802.11 protocol to support the Deadline
Monotonic (DM) scheduling policy. We then evaluate the performance of DM in
terms of throughput, average medium access delay and medium access delay
distribution. To evaluate the performance of the DM policy, we develop a Markov
chain based analytical model and derive expressions for the throughput, the
average MAC layer service time and the service time distribution. We then
validate the mathematical model and extend the analytical results to a multi-hop
network by simulation using the ns-2 network simulator.

Keywords: Deadline Monotonic, 802.11, Performance evaluation, Average medium access delay, Throughput, Probabilistic medium access delay bounds.

1 INTRODUCTION

Supporting applications with QoS requirements has become an important challenge for all communications networks. In wireless LANs, the IEEE 802.11 protocol [5] has been enhanced and the IEEE 802.11e protocol [6] was proposed to support quality of service over wireless communications.

In the absence of a coordination point, IEEE 802.11 defines the Distributed Coordination Function (DCF) based on the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) protocol. IEEE 802.11e proposes the Enhanced Distributed Channel Access (EDCA) as an extension of DCF. With EDCA, each station maintains four priorities called Access Categories (ACs). The quality of service offered to each flow depends on the AC to which it belongs.

Nevertheless, the granularity of service offered by 802.11e (4 priorities at most) cannot satisfy the requirements of real time flows (where each flow is characterized by its own delay bound).

Therefore, we propose in this paper a new medium access mechanism based on the Deadline Monotonic (DM) policy [9] to schedule real time flows over 802.11. Indeed, DM is a real time scheduling policy that assigns static priorities to flow packets according to their deadlines, the packet with the shortest deadline being assigned the highest priority. To support the DM policy over 802.11, we use distributed scheduling and introduce a new medium access backoff policy. We then focus on the performance evaluation of the DM policy in terms of achievable throughput, average MAC layer service time and MAC layer service time distribution. Hence, we follow these steps:

- First, we propose a Markov chain framework modeling the backoff process of n contending stations within the same broadcast region [1]. Due to the complexity of the mathematical model, we restrict the analysis to n contending stations belonging to two traffic categories (each traffic category is characterized by its own delay bound).

- From the analytical model, we derive the throughput achieved by each traffic category.

- Then, we use generalized Z-transforms [3] to derive expressions for the average MAC layer service time and the service time distribution.

- As the analytical model was restricted to two traffic categories, the analytical results are extended by simulation to different traffic categories.

- Finally, we consider a simple multi-hop scenario to deduce the behavior of the DM policy in a multi-hop environment.


The rest of this paper is organized as follows. In section 2, we review the state of the art of the IEEE 802.11 DCF, QoS support over 802.11 (mainly the IEEE 802.11e EDCA) and real time scheduling over 802.11. In section 3, we present the distributed scheduling and introduce the new medium access backoff policy to support DM over 802.11. In section 4, we present our mathematical model based on Markov chain analysis. Sections 5 and 6 present respectively the throughput and the service time analysis. Analytical results are validated by simulation using the ns-2 network simulator [16]. In section 7, we extend our study by simulation, first to take into consideration different traffic categories, and second to study the behavior of the DM algorithm in a multi-hop environment where factors like interference or routing protocols exist. Finally, we conclude the paper in section 8.

2 LITERATURE REVIEW

2.1 The 802.11 protocol

2.1.1 Description of the IEEE 802.11 DCF
Using DCF, a station shall ensure that the channel is idle when it attempts to transmit. Then it selects a random backoff in the contention window [0, CW - 1], where CW is the current window size and varies between the minimum and the maximum contention window sizes. If the channel is sensed busy, the station suspends its backoff until the channel becomes idle for a Distributed Inter Frame Space (DIFS) after a successful transmission, or an Extended Inter Frame Space (EIFS) after a collision. The packet is transmitted when the backoff reaches zero. A packet is dropped if it collides after the maximum number of retransmission attempts.

The two-way handshaking packet transmission procedure described above is called the basic access mechanism. DCF also defines a four-way handshaking technique called Request To Send/Clear To Send (RTS/CTS) to prevent the hidden station problem. A station S_j is said to be hidden from S_i if S_j is within the transmission range of the receiver of S_i and out of the transmission range of S_i.

2.1.2 Performance evaluation of the 802.11 DCF
Different works have been proposed to evaluate the performance of the 802.11 protocol based on Bianchi's work [1]. Indeed, Bianchi proposed a Markov chain based analytical model to evaluate the saturation throughput of the 802.11 protocol. By saturation conditions, it is meant that contending stations always have packets to transmit.

Several works extended the Bianchi model either to suit more realistic scenarios or to evaluate other performance parameters. Indeed, the authors of [2] incorporate the frame retry limits in Bianchi's model and show that Bianchi overestimates the maximum achievable throughput. The native model is also extended in [10] to a non-saturated environment. In [12], the authors derive the average packet service time at an 802.11 node. A new generalized Z-transform based framework has been proposed in [3] to derive probabilistic bounds on the MAC layer service time. Therefore, it would be possible to provide probabilistic end-to-end delay bounds in a wireless network.

2.2 Supporting QoS over 802.11

2.2.1 Differentiation mechanisms over 802.11
Emerging applications like audio and video applications require quality of service guarantees in terms of throughput, delay, jitter, loss rate, etc. Transmitting such flows over wireless communications requires supporting service differentiation mechanisms over such networks.

Many medium access schemes have been proposed to provide some QoS enhancements over the IEEE 802.11 WLAN. Indeed, [4] assigns different priorities to the incoming flows. Priority classes are differentiated according to one of three 802.11 parameters: the backoff increase function, the Inter Frame Spacing (IFS) and the maximum frame length. Experiments show that all three differentiation schemes offer better guarantees for the highest priority flow, but the backoff increase function mechanism does not perform well with TCP flows because ACKs affect the differentiation mechanism.

In [7], an algorithm is proposed to provide service differentiation using two parameters of IEEE 802.11, the backoff interval and the IFS. With this scheme, high priority stations are more likely to access the medium than low priority ones. The research described above led to the standardization of a new protocol that supports QoS over 802.11, the IEEE 802.11e protocol [6].

2.2.2 The IEEE 802.11e EDCA
The IEEE 802.11e proposes a new medium access mechanism called the Enhanced Distributed Channel Access (EDCA), which enhances the IEEE 802.11 DCF. With EDCA, each station maintains four priorities called Access Categories (ACs). Each access category is characterized by a minimum and a maximum contention window size and an Arbitration Inter Frame Spacing (AIFS).

Different analytical models have been proposed to evaluate the performance of 802.11e EDCA. In [17], Xiao extends Bianchi's model to the prioritized schemes provided by 802.11e by introducing multiple ACs with distinct minimum and maximum contention window sizes. But the AIFS differentiation parameter is lacking in Xiao's model.


different works to evaluate the performance of the backoff value is inferred from the deadline
IEEE 802.11e EDCA [13], [14], [15]. They proposed information.
a model that takes into consideration all the
differentiation parameters of the EDCA especially 3 SUPPORTING DEADLINE MONOTONIC
the AIFS one. Moreover, different parameters of QoS have been evaluated, such as throughput, average service time, service time distribution and probabilistic response time bounds, for both the saturated and the non-saturated cases.

Although the IEEE 802.11e EDCA classifies the traffic into four prioritized ACs, there is still no guarantee of real time transmission service. This is due to the lack of a satisfactory scheduling method for various delay-sensitive flows. Hence, we need a scheduling policy dedicated to such delay sensitive flows.

2.3 Real time scheduling over 802.11

A distributed solution for the support of real-time sources over IEEE 802.11, called Blackburst, is discussed in [8]. This scheme modifies the MAC protocol to send short transmissions in order to gain priority for real-time service. It is shown that this approach is able to support bounded delays. The main drawback of this scheme is that it requires constant intervals for high priority traffic; otherwise the performance degrades very much.

In [18], the authors introduced a distributed priority scheduling over 802.11 to support a class of dynamic priority schedulers such as Earliest Deadline First (EDF) or Virtual Clock (VC). Indeed, the EDF policy is used to schedule real time flows according to their absolute deadlines, where the absolute deadline is the arrival time at the node plus the delay bound. To realize a distributed scheduling over 802.11, the authors of [18] used a priority broadcast mechanism where each station maintains an entry for the highest priority packet of all other stations. Thus, stations can adjust their backoff according to the priorities of the other stations. The overhead introduced by the priority broadcast mechanism is negligible. This is due to the fact that priorities are exchanged using native DATA and ACK packets. Nevertheless, the authors of [18] proposed a generic backoff policy that can be used by a class of dynamic priority schedulers, no matter whether the scheduler targets delay sensitive flows or rate sensitive flows.

In this paper, we focus on delay sensitive flows and propose to support the fixed priority Deadline Monotonic (DM) policy over 802.11 to schedule delay sensitive flows. For this purpose, we use a priority broadcast mechanism similar to [18] and introduce a new medium access backoff policy in which the backoff accounts for packet deadlines.

3 THE DEADLINE MONOTONIC (DM) POLICY OVER 802.11

With DCF, all the stations share the same transmission medium. Then, the HOL (Head of Line) packets of all the stations (highest priority packets) will contend for the channel with the same priority, even if they have different deadlines. Introducing DM over 802.11 allows stations having packets with short deadlines to access the channel with higher priority than those having packets with long deadlines. Providing such a QoS requires distributed scheduling and a new medium access policy.

3.1 Distributed Scheduling over 802.11

To realize a distributed scheduling over 802.11, we introduce a priority broadcast mechanism similar to [18]. Indeed, each station maintains a local scheduling table with entries for the HOL packets of all other stations. Each entry in the scheduling table of node S_i comprises two fields (S_j, D_j), where S_j is the source node MAC address and D_j is the deadline of the HOL packet of node S_j. To broadcast the HOL packet deadlines, we propose to use the two-way handshake DATA/ACK access mode. When a node S_i transmits a DATA packet, it piggybacks the deadline of its HOL packet. Nodes hearing the DATA packet add an entry for S_i in their local scheduling tables by filling the corresponding fields. The receiver of the DATA packet copies the priority of the HOL packet in the ACK before sending the ACK frame. All the stations that did not hear the DATA packet add an entry for S_i using the information in the ACK packet.
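As an illustration of this priority broadcast mechanism, the following minimal sketch (ours, not part of the paper; the class and field names are assumptions) shows how a station could maintain its local scheduling table from overheard DATA and ACK frames:

    # Illustrative sketch of the Section 3.1 scheduling table (assumed layout).
    class SchedulingTable:
        """Maps a source MAC address to the deadline of that node's HOL packet."""

        def __init__(self):
            self.entries = {}

        def on_data_heard(self, src_mac, hol_deadline):
            # A DATA frame piggybacks the sender's HOL packet deadline.
            self.entries[src_mac] = hol_deadline

        def on_ack_heard(self, data_src_mac, copied_deadline):
            # Stations that missed the DATA frame learn the entry from the ACK,
            # into which the receiver copied the HOL packet deadline.
            self.entries[data_src_mac] = copied_deadline

        def dt_min(self):
            # Smallest HOL packet deadline currently known to this station.
            return min(self.entries.values()) if self.entries else None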

3.2 DM medium access backoff policy

Let us consider two stations S1 and S2 transmitting two flows with the same deadline D1 (D1 is expressed as a number of 802.11 slots). The two stations having the same delay bound can access the channel with the same priority using the native 802.11 DCF. Now, we suppose that S1 and S2 transmit flows with different delay bounds D1 and D2, such that D1 < D2, and generate two packets at time instants t1 and t2. If S2 had the same delay bound as S1, its packet would have been generated at time t'2 such that t'2 = t2 + D21, where D21 = (D2 - D1). At that time, S1 and S2 would have the same priority and would transmit their packets according to the 802.11 protocol.

Thus, to support DM over 802.11, each station uses a new backoff policy where the backoff is given by:

- the random backoff selected in [0, CW-1] according to the 802.11 DCF, referred to as the BAsic Backoff (BAB);
- the DM Shifting Backoff (DMSB): the additional backoff slots that a station with low priority (the HOL packet having a large deadline) adds to its BAB in order to have the same priority as the station with the highest priority (the HOL packet having the shortest deadline).

Whenever a station S_i sends an ACK or hears an ACK on the channel, its DMSB is re-evaluated as follows:

DMSB(S_i) = Deadline(HOL(S_i)) - DTmin(S_i)     (1)

where DTmin(S_i) is the minimum of the HOL packet deadlines present in the scheduling table of S_i, and Deadline(HOL(S_i)) is the HOL packet deadline of node S_i.

Hence, when S_i has to transmit its HOL packet with a delay bound D_i, it selects a BAB in the contention window [0, CWmin-1] and computes the WHole Backoff (WHB) value as follows:

WHB(S_i) = DMSB(S_i) + BAB(S_i)     (2)

The station S_i decrements its BAB when it senses an idle slot. Now, we suppose that S_i senses the channel busy. If a successful transmission is heard, then S_i re-evaluates its DMSB when a correct ACK is heard, and adds the new DMSB value to its current BAB as in equation (2). Whereas, if a collision is heard, S_i reinitializes its DMSB and adds it to its current BAB, to allow colliding stations to contend with the same priority as for their first transmission attempt. S_i transmits when its WHB reaches 0. If the transmission fails, S_i doubles its contention window size and repeats the above procedure until the packet is successfully transmitted, or dropped after the maximum number of retransmission attempts.

4 MATHEMATICAL MODEL OF THE DM POLICY OVER 802.11

In this section, we propose a mathematical model to evaluate the performance of the DM policy using Markov chain analysis [1]. We consider the following assumptions:

Assumption 1: The system under study comprises n contending stations hearing each other's transmissions.

Assumption 2: Each station S_i transmits a flow F_i with a delay bound D_i. The n stations are divided into two traffic categories C1 and C2 such that:
- C1 represents n1 nodes transmitting flows with delay bound D1;
- C2 represents n2 nodes transmitting flows with delay bound D2, such that D1 < D2, D21 = (D2 - D1) and (n1 + n2) = n.

Assumption 3: We operate in saturation conditions: each station immediately has a packet available for transmission after the service completion of the previous packet [1].

Assumption 4: A station selects a BAB in a constant contention window [0, W-1] independently of the transmission attempt. This is a simplifying assumption to limit the complexity of the mathematical model.

Assumption 5: We are in stationary conditions, i.e. the n stations have already sent at least one packet.

Depending on the traffic category to which it belongs, each station S_i will be modeled by a Markov chain representing its whole backoff (WHB) process.
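Before turning to the analysis, the backoff computation of Section 3.2 can be made concrete with the following small sketch (ours, for illustration only; it assumes a fixed contention window and a scheduling table represented as a dictionary):

    # Sketch of equations (1) and (2); an illustration, not the authors' simulator.
    import random

    def dm_shifting_backoff(own_deadline, table):
        # DMSB(S_i) = Deadline(HOL(S_i)) - DTmin(S_i), equation (1).
        dt_min = min([own_deadline] + list(table.values()))
        return own_deadline - dt_min

    def whole_backoff(own_deadline, table, cw_min=32):
        # WHB(S_i) = DMSB(S_i) + BAB(S_i), equation (2).
        bab = random.randint(0, cw_min - 1)        # BAsic Backoff in [0, CWmin-1]
        return dm_shifting_backoff(own_deadline, table) + bab

    # Two flows with D1 = 4 and D2 = 9 slots: the category C2 station adds
    # D21 = D2 - D1 = 5 extra slots to its basic backoff.
    table = {"S1": 4, "S2": 9}
    print(dm_shifting_backoff(4, table), dm_shifting_backoff(9, table))   # 0 5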

4.1 Markov chain modeling a station of category C1

Figure 1 illustrates the Markov chain modeling a station S1 of category C1. The states of this Markov chain are described by the quadruplet (R, i, i-j, -D21) where:

- R: takes two values, denoted C2 and ~C2. When R = ~C2, the n2 stations of category C2 are decrementing their shifting backoff (DMSB) during D21 slots and would not contend for the channel. When R = C2, the D21 slots have already elapsed and the stations of category C2 will contend for the channel.
- i: the value of the BAB selected by S1 in [0, W-1].
- (i-j): corresponds to the current backoff of the station S1.
- -D21: corresponds to -(D2 - D1). We choose the negative notation -D21 for stations of C1 to express the fact that only stations of category C2 have a positive DMSB, equal to D21.

Figure 1: Markov chain modeling a category C1 station

Initially, S1 selects a random BAB and is in one of the states (~C2, i, i, -D21), i = 0..W-1. During (D21 - 1) slots, S1 decrements its backoff if none of the (n1 - 1) remaining stations of category C1 transmits. Indeed, during these slots, the n2 stations of category C2 are decrementing their DMSB and would not contend for the channel.

When S1 is in one of the states (~C2, i, i-D21+1, -D21), i = D21..W-1, and senses the channel idle, it decrements its D21-th slot. But S1 knows that henceforth the n2 stations of category C2 can contend for the channel (the D21 slots have elapsed). Hence, S1 moves to one of the states (C2, i, i-D21, -D21), i = D21..W-1.

However, when the station S1 is in one of the states (~C2, i, i-j, -D21), for i = 1..W-1, j = 0..min(D21-1, i-1), and at least one of the (n1 - 1) remaining stations of category C1 transmits, then the stations of category C2 will reinitialize their DMSB and would not contend for the channel during D21 additional slots. Therefore, S1 moves to the state (~C2, i-j, i-j, -D21), i = 1..W-1, j = 0..min(D21-1, i-1).

Now, if S1 is in one of the states (C2, i, i-D21, -D21), i = D21+1..W-1, and at least one of the (n - 1) remaining stations (either a category C1 or a category C2 station) transmits, then S1 moves to one of the states (~C2, i-D21, i-D21, -D21), i = D21+1..W-1.

4.2 Markov chain modeling a station of category C2

Figure 2 illustrates the Markov chain modeling a station S2 of category C2. Each state of the S2 Markov chain is represented by the quadruplet (i, k, D21-j, D21) where:

- i: refers to the BAB value selected by S2 in [0, W-1].
- k: refers to the current BAB value of S2.
- D21-j: refers to the current DMSB of S2, j in [0, D21].
- D21: corresponds to (D2 - D1).

When S2 selects a BAB, its DMSB equals D21 and it is in one of the states (i, i, D21, D21), i = 0..W-1. During D21 slots, only the n1 stations of category C1 contend for the channel. If S2 senses the channel idle during the D21 slots, it moves to one of the states (i, i, 0, D21), i = 0..W-1, where it ends its shifting backoff.
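As a rough illustration of the two state spaces just described, the quadruplets can be enumerated as follows (a sketch based on our reading of the sets above; the exact index ranges are assumptions):

    # Illustrative enumeration of the Section 4.1 / 4.2 quadruplet states.
    # The index ranges follow our reading of the text and may differ in detail.
    def category_c1_states(W, D21):
        states = [("~C2", i, i - j, -D21)
                  for i in range(W)
                  for j in range(min(max(0, i - 1), D21 - 1) + 1)]
        states += [("C2", i, i - D21, -D21) for i in range(D21, W)]
        return states

    def category_c2_states(W, D21):
        states = [(i, i, D21 - j, D21) for i in range(W) for j in range(D21 + 1)]
        states += [(i, i - 1, 0, D21) for i in range(2, W)]
        return states

    print(len(category_c1_states(32, 4)), len(category_c2_states(32, 4)))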

Figure 2: Markov chain modeling a category C 2 Station

When S 2 is in one of the states i , i ,0 , D 21  ,  2  i , i , D 21  j , D 21 , i  0..W  1,


i  0..W  1 , the n 2  1 other stations of category j  0..D 21  1
C 2 have also decremented their DMSB and can   2 : the set of states of S 2 , where stations
contend for the channel. Thus, S 2 decrements its of category C 2 contend for the channel
BAB and moves to the state i , i  1,0 , D 21  , (pink states in figure 2).
i  2..W  1 , only if none of the n  1 remaining  2  i , i ,0 , D21 , i  0..W  1
stations transmits.  i , i  1,0 , D 21 , i  2..W  1

If S 2 is in one of the states i , i  1,0 , D 21  , Therefore, when stations of category C1 are in


i  2..W  1 , and at least one of the n  1 one the states of  1 , stations of category C 2 are in
remaining stations transmits, the n 2 stations of one of the states of  2 . Similarly, when stations of
category C 2 will reinitialize their DMSB and S 2 category C1 are is in one of the states of  1 ,
moves to the state i  1, i  1, D21 , D21  , stations of category C 2 are in one of the states of
i  2..W  1 .
2.
4.3 Blocking probabilities in the Markov chains Hence, we derive the expressions of S1
According to the explanations given in blocking probabilities p 11 and p 12 shown in
paragraphs 4.1 and 4.2, the states of the Markov figure 1 as follows:
chains modeling stations S 1 and S 2 can be divided
into the following groups:  p 11 : the probability that S 1 is blocked given
that S 1 is in one of the states of  1 . p 11 is
  1 : the set of states of S 1 where none of the
the probability that at least a station S 1' of
n 2 stations of category C 2 contends for the
the other n1  1 stations of C1 transmits
channel (blue states in figure 1).
 1  ~ C 2 , i , i  j , D 21 , i  0..W  1, given that S 1' is in one of the states of  1 .
j  0.. minmax0 , i  1, D 21  1 p 11  1  1   11 n1 1 (3)
where  11 is the probability that a station S 1'
  1 : the set of states of S 1 where stations of
of C1 transmits given that S 1' is in one of
category C 2 can contend for the channel
the states of  1 :
(pink states in figure 1).
 1  C 2 , i , i  D 21 , D 21 , i  D 21 ..W  1 
 11  Pr S 1' transmits  1 
~ C2 ,0 ,0 , D21 
1
  2 : the set of states of S 2 where stations of  (4)
W 1  min max 0 ,i 1,D21 1 
category C 2 do not contend for the channel  

i 0 
  1~ C2 ,i ,i  j , D21  

(blue states in figure 2). j 0 

 1R ,i ,i  j , D21  is defined as the probability of
 p 22 : the probability that S 2 is blocked
the state R , i , i  j , D21 , in the stationary
given that S 2 is in one of the states of  2 .
conditions and  1   1  R ,i ,i  j , D21 
 is the p 22  1  1   12 n1 1   22 n2 1 (9)
probability vector of a category C1 station.
The blocking probabilities described above
 p 12 : the probability that S 1 is blocked allow deducing the transition state probabilities and
given that S 1 is in one of the states of  1 . having the transition probability matrix Pi , for a
p 12 is the probability that at least a station station of traffic category C i .
S 1' of the other n1  1 stations of C1 Therefore, we can evaluate the state
probabilities by solving the following system [11]:
transmits given that S 1' is in one of the states
of  1 or at least a station S '2 of the n 2  i Pi   i

stations of C 2 transmits given that S '2 is in 
  ij  1
 j
(10)
one of the states of  2 . 
p 12  1  1   12 n1 1 1   22 n2 (5) 4.4 Transition probability matrices
4.4.1 Transition probability matrix of a
where  12 is the probability that a station category C1 station
S 1' of C1 transmits given that S 1' is in one Let P1 be the transition probability matrix of
the station S 1 of category C1 . P1 i , j is the
of the states of  1 .

 12  Pr S 1' transmits  1  probability to transit from state i to state j . We
have:
C2 ,D21 ,0 , D21 
1
 W 1
(6) P1 ~ C 2 , i , i  j , D 21 , ~ C 2 , i , i   j  1, D 21 

i  D21
1C2 ,i ,i  D21 , D21 
 1  p11 , i  2..W  1, j  0.. mini  2 , D 21  2 
(11)
P1 ~ C 2 , i ,1, D 21 , ~ C 2 ,0 ,0 , D 21   1  p11 ,
and  22 the probability that a station S 2' of i  1.. minW  1, D 21  1
C 2 transmits given that S 2' is in one of the (12)
states of  2 . P1 ~ C 2 , i , i  D 21  1, D 21 , C 2 , i , i  D 21 , D 21 
 1  p 11 , i  D 21 ..W  1

 12  Pr S 2' transmits  2  (13)
2 0 ,0 ,0 ,D21  P1~ C2 , i , i  j , D21 , ~ C2 , i  j , i  j , D21 
(14)
 W 1 W 1
(7)  p11 , i  2..W  1, j  1.. mini  1, D21  1
 i ,i ,0 ,D21 
2   i ,i 1,0 ,D21 
2
i 0 i2 P1~ C2 , i , i , D21 , ~ C2 , i , i , D21   p11 ,
(15)
i  1..W  1
 2i ,k ,D21  j ,D21  is defined as the probability
of the state i , k , D21  j , D 21 , in the P1C2 ,i ,i  D21 , D21 ,~ C2 ,i  D21 ,i  D21 , D21 
 p12 ,i  D21  1..W  1
stationary condition.  2   2  i ,k ,D21  j ,D21 
 (16)
is the probability vector of a category C 2
station. P1C2 ,i ,i  D21 , D21 ,C2 ,i  1,i  1  D21 , D21 
 1  p12 ,i  D21  1..W  1
In the same way, we evaluate p 21 and p 22 the
(17)
blocking probabilities of station S 2 shown in
figure 2: 1
P1~ C2 ,0 ,0 , D21 , ~ C2 , i , i , D21   ,
 p 21 : the probability that S 2 is blocked W (18)
given that S 2 is in one of the states of  2 . i  0..W  1
p 21  1  1   11  n1
(8)
If D 21  W  then:

P1C2 , D21 ,0 , D21 , ~ C2 , i , i , D21  
1
,  11  f  11 , 12 , 22 
W (19)   f  , , 
 12 11 12 22
i  0..W  1
 22  f  11 , 12 , 22 
under the constraint
By replacing p11 and p 12 by their values in 
equations (3) and (5) and by replacing P1 and  1  11  0 , 12  0 , 22  0 , 11  1, 12  1, 22  1
in (10) and solving the resulting system, we can (28)
R ,i ,i  j , D21 
express  1 as a function of  11 ,  12 and Solving the above system (28), allows deducing
 22 given respectively by equations (4), (6) and the expressions of  11 ,  12 and  22 , and deriving
(7). the state probabilities of Markov chains modeling
category C1 and category C 2 stations.
4.4.2 Transition probability matrix of a
category C2 station
Let P2 be the transition probability matrix of 5 THROUGHPUT ANALYSIS
the station S 2 belonging to the traffic category C 2 .
The transition probabilities of S 2 are: In this section, we propose to evaluate Bi , the
normalized throughput achieved by a station of
P2 i , i , D21  j , D21 , i , i , D21   j  1, D21  traffic category C i [1]. Hence, we define:
(20)
 1  p21 , i  0..W  1, j  0..D21  1
 Pi ,s : the probability that a station Si
P2 i , i , D21  j , D21 , i , i , D21 , D21   p21 , belonging to traffic category C i transmits a
(21)
i  0..W  1, j  0..D21  1 packet successfully. Let S 1 and S 2 be two
stations belonging respectively to traffic
P2 i , i ,0 , D21 , i , i  1,0 , D21   1  p22 , categories C1 and C 2 . We have:
(22)
i  2..W  1
P1,s  Pr S1 transmits successfully 1  Pr1 
P2 1,1,0 , D21 , 0 ,0 ,0 , D21   1  p22 (23)  Pr S1 transmits successfully  1  Pr 1 
  11 1  p11  Pr 1    12 1  p12  Pr  1 
P2 i , i ,0 , D21 , i , i , D21 , D21   p22 ,
(24) (29)
i  1..W  1
P2 ,s  Pr S 2 transmits successfully  2  Pr  2 
P2 i , i  1,0 , D21 , i  1, i  1, D21 , D21   p22 ,
(25)  Pr S 2 transmits successfully  2  Pr 2 
i  2..W  1
  22 1  p 22  Pr  2 
P2 i , i  1,0 , D21 , i  1, i  2 ,0 , D21   1  p22 , (30)
(26)
i  3..W  1  Pidle : the probability that the channel is idle.
1
P2 0 ,0 ,0 , D21 , i , i , D21 , D21   , i  0..W  1 (27)
W The channel is idle if the n1 stations of
category C1 don’t transmit given that these stations
By replacing p 21 and p 22 by their values in are in one of the states of  1 or if the n stations
equations (8) and (9) and by replacing P2 and  2 (both category C1 and category C 2 stations) don’t
in (10) and solving the resulting system, we can
transmit given that stations of category C1 are in
i ,k ,D21  j ,D21 
express  2 as a function of  11 ,  12 one of the states of  1 . Thus:
and  22 given respectively by equations (4), (6)
R ,i ,i  j , D21 
and (7). Moreover, by replacing  1 and Pidle  1   11 n1 Pr  1   1   12 n1 1   22 n2 Pr  1 
 2i ,k ,D21  j ,D21  by their values, in equations (4), (6) (31)

and (7), we obtain a system of non linear equations Hence, the expression of the throughput of a
as follows: category C i station is given by:

B_i = P_{i,s} T_p / [ P_Idle T_e + P_s T_s + (1 - P_Idle - (n1 P_{1,s} + n2 P_{2,s})) T_c ]     (32)

where T_e denotes the duration of an empty slot, and T_s and T_c denote respectively the duration of a successful transmission and of a collision. The term (1 - P_Idle - (n1 P_{1,s} + n2 P_{2,s})) corresponds to the probability of collision. Finally, T_p denotes the average time required to transmit the packet data payload. We have:

T_s = T_PHY + T_MAC + T_p + T_D + SIFS + T_PHY + T_ACK + T_D + DIFS     (33)

T_c = T_PHY + T_MAC + T_p + T_D + EIFS     (34)

where T_PHY, T_MAC and T_ACK are the durations of the PHY header, the MAC header and the ACK packet [1], [13]. T_D is the time required to transmit the two bytes of deadline information. Stations hearing a collision wait during EIFS before resuming their packets.

For the numerical results, stations transmit 512-byte data packets using the 802.11b MAC and PHY layer parameters (given in Table 1) with a data rate equal to 11 Mbps. For the simulation scenarios, the propagation model is a two-ray ground model. The transmission range of each node is 250 m. The distance between two neighbors is 5 m. The EIFS parameter is set to ACKTimeout as in ns-2, where:

ACKTimeout = DIFS + T_PHY + T_ACK + T_D + SIFS     (35)

Table 1: 802.11b parameters.
Data Rate          11 Mb/s
Slot               20 µs
SIFS               10 µs
DIFS               50 µs
PHY Header         192 µs
MAC Header         272 µs
ACK                112 µs
Short Retry Limit  7
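To make these quantities concrete, the following numerical sketch (ours; the probability values are placeholders and not results of the model) plugs the Table 1 parameters into equations (32)-(35):

    # Timing quantities of equations (33)-(35) with the Table 1 parameters, and
    # the normalized throughput of equation (32). The probabilities P_idle and
    # P_{i,s} would come from solving the Markov chains; the values below are
    # placeholders. P_s is taken here as n1*P_{1,s} + n2*P_{2,s} (an assumption).
    SLOT, SIFS, DIFS = 20e-6, 10e-6, 50e-6
    T_PHY, T_MAC, T_ACK = 192e-6, 272e-6, 112e-6
    DATA_RATE = 11e6                            # 11 Mb/s
    T_P = 512 * 8 / DATA_RATE                   # 512-byte payload
    T_D = 2 * 8 / DATA_RATE                     # two bytes of deadline information
    EIFS = SIFS + T_PHY + T_ACK + T_D + DIFS    # ACKTimeout, equation (35)

    T_S = T_PHY + T_MAC + T_P + T_D + SIFS + T_PHY + T_ACK + T_D + DIFS   # eq. (33)
    T_C = T_PHY + T_MAC + T_P + T_D + EIFS                                # eq. (34)

    def throughput(p_idle, p_s, n, t_e=SLOT):
        # Normalized throughput B_i of one station of each category, eq. (32).
        p_succ = sum(ni * pi for ni, pi in zip(n, p_s))
        denom = p_idle * t_e + p_succ * T_S + (1 - p_idle - p_succ) * T_C
        return [pi * T_P / denom for pi in p_s]

    print(throughput(p_idle=0.5, p_s=[0.05, 0.03], n=[4, 4]))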
For all the scenarios, we consider that we are in the presence of n contending stations, with n/2 stations for each traffic category. In figure 3, n is fixed to 8 and we depict the throughput achieved by the different stations present in the network as a function of the contention window size W, with D21 = 1. We notice that the throughput achieved by category C1 stations (stations numbered from S11 to S14) is greater than the one achieved by category C2 stations (stations numbered from S21 to S24).

Figure 3: Normalized throughput as a function of the contention window size (D21 = 1, n = 8)

Analytically, stations belonging to the same traffic category have the same throughput, given by equation (32). Simulation results validate the analytical results and show that stations belonging to the same traffic category (either category C1 or category C2) have nearly the same throughput. Thus, we conclude the fairness of DM between stations of the same category.

For the subsequent throughput scenarios, we focus on one representative station of each traffic category. Figure 4 compares the category C1 and category C2 station throughputs to the one obtained with 802.11. The curves are represented as a function of W and for different values of D21. Indeed, as D21 increases, the category C1 station throughput increases, whereas the category C2 station throughput decreases. Moreover, as W increases, the difference between the station throughputs is reduced. This is due to the fact that the shifting backoff becomes negligible compared to the contention window size.

Finally, we notice that the category C1 station obtains better throughput with DM than with

802.11, but the opposite scenario happens to the expression of the average service time and the
category C 2 station. service time distribution. The service time depends
on the duration of an idle slot Te , the duration of a
successful transmission Ts and the duration of a
collision Tc [1], [3],[14]. As Te is the smallest
duration event, the duration of all events will be
T 
given by  event  .
T
 e 

6.1 Z-Transform of the MAC layer service time

6.1.1 Service time Z-transform of a category


C1 station:
Let TS 1 Z  be the service time Z-transform of
a station S 1 belonging to traffic category C1 . We
Figure 4: Normalized throughput as a function of
define:
the contention window size (different D 21 values)
H 1R ,i ,i  j , D21  Z  : The Z-transform of the
In figure 5, we generalize the results for
different numbers of contending stations and fix the time already elapsed from the instant S 1 selects a
contention window size W to 32. basic backoff in 0 ,W  1 (i.e. being in one of the
states ~ C 2 , i , i , D21  ) to the time it is found in the
state R , i , i  j , D 21  .
Moreover, we define:

11
 Psuc : the probability that S 1 observes a
successful transmission on the channel,
while S 1 is in one of the states of  1 .
11
Psuc  n1  1 11 1   11 n1  2 (36)

12
 Psuc : the probability that S 1 observes a
successful transmission on the channel,
while S 1 is in one of the states of  1 .
Figure 5: Normalized throughput as a function of 12
Psuc  n1  1 12 1   12 n1  2 1   22 n2
the number of contending stations (37)
 n2 22 1   22 n2 1 1   12 n1 1
All the curves show that DM performs service
differentiation over 802.11 and offers better We evaluate H 1R ,i ,i  j , D21  Z  for each state
throughput for category C1 stations independently
of S 1 Markov chain as follows:
of the number of contending stations.

  Ts 
1  11  Te 
6 SERVICE TIME ANALYSIS H 1~ C2 ,i ,i , D21 Z    Psuc Z 
W 

In this section, we evaluate the average MAC
 Tc  
layer service time of category C1 and category C 2   min i  D21 1,W 1

stations using the DM policy. The service time is 


p11  Psuc 11
 T 
Z e   H 1~ C 2 ,k ,i , D21  Z 
the time interval from the time instant that a packet  k i 1

becomes at the head of the queue and starts to   Ts   Tc  


 12  Te 
 
 
contend for transmission to the time instant that
 Ĥ 1C 2 ,i  D21 ,i , D21  Z  Psuc Z 12 T
 p11  Psuc Z  e  
either the packet is acknowledged for a successful  
transmission or dropped [3].  
(38)
We propose to evaluate the Z-Transform of the Where:
MAC layer service time [3], [14], [15] to derive an

 Ĥ 1C ,i  D ,i , D  Z   H 1C ,i  D ,i , D  Z   Ts 

1  p11 H 1~ C ,0 ,0 , D


 
 TS1 Z   Z Z 
2 21 21 2 21 21
 Te 
 if i  D 21   W  1 (39) 2 21 

 m   c
T 
 Ĥ 1C2 ,i  D21 ,i , D21  Z   0 Otherwise  1  p12 H 1C2 ,D21 ,0 , D21  Z      Te 
Z  p11H 1~C ,0 ,0 , D Z 
2 21 

i 0
We also have: 
 p12 H 1C 2 ,D21 ,0 , D21  Z  i

1  p11 Z  j H 1~ C2 ,i ,i , D21  Z  m 1


H 1~ C2 ,i ,i  j , D21  Z     Tc 
 T


 Ts   Tc 

  Z  e  p11 H 1~ C2 ,0 ,0 , D21 Z   p12 H 1C 2 ,D21 ,0 , D21  Z   
 
   
11  Te  11 T
1 Psuc Z  p11  Psuc Z  e   
 
i  2..W  1, j  1..mini  1, D21  1
(40) (44)

1  p11 Z D21 H 1~ C2 ,i ,i , D21 Z  6.1.2 Service time Z-transform of a category
H 1C2 ,i ,i  D21 , D21  Z   C2 station:
 Ts   Tc 

  In the same way, let TS 2 Z  be the service


   
11
1  Psuc Z  Te   p11  11 T
Psuc Z  e 
 1  p12 ZH 1C2 ,i 1,i 1 D21 , D21  Z ,i  D21 ..W  2 time Z-transform of a station S 2 of category C 2 .
We define:
(41)
H 2i ,k ,D21  j ,D21  Z  : The Z-transform of the
H 1C2 ,W 1,W 1 D21 , D21 Z  time already elapsed from the instant S 2 selects a
1  p11 Z  D21
H 1~ C 2 ,W 1,W 1, D21  Z  basic backoff in 0 ,W  1 (i.e. being in one of the
 (42)
 Ts   Tc  states i , i , D 21 , D 21  ) to the time it is found in the
 
   
11
1  Psuc Z  Te  11
 p11  Psuc Z  Te  state i , k , D21  j , D21  .
Moreover, we define:
1  p11 ZH 1~ C2 ,1,1, D21  Z 
H 1~ C2 ,0 ,0 , D21  Z   21
 Ts   Tc   Psuc : the probability that S 2 observes a
 
   
11  Te  11 T
1 Psuc Z  p11  Psuc Z e  successful transmission on the channel,
min W 1,D21 1 while S 2 is in one of the states of  2 .
1
 1  p11 Z  H 1
i2
~ C 2 ,i ,1, D21  Z  
W
21
Psuc  n1 11 1   11 n1 1 (45)
(43)
22
 Psuc : the probability that S 2 observes a
If S 1 transmission state is ~ C 2 ,0 ,0 , D 21  , successful transmission on the channel,
the transmission will be successful only if none of while S 2 is in one of the states of  2 .
the n1  1 remaining stations of C1 transmits. 22
Psuc  n1 12 1   12 n1 1 1   22 n2 1
Whereas when the station S 1 transmission state is (46)
 n2  1 22 1   22 n2  2 1   12 n1
C 2 , D21 ,0 , D 21  , the transmission occurs
successfully only if none of n  1 remaining We evaluate H 2i ,i ,D21  j ,D21  Z  for each state
stations (either a category C1 or a category C 2
of S 2 Markov chain as follows:
station) transmits.
1
H 2i ,i ,D21 ,D21  Z   , i  0 and i  W  1 (47)
If the transmission fails, S 1 tries another W
transmission. After m retransmissions, if the   Ts 
1  22  Te 
packet is not acknowledged, it will be dropped. H 2i ,i ,D21 ,D21  Z     Psuc Z 
Thus, the Z-transform of station S 1 service time is: W 

 Tc  
 Z 
 
 Te 
 H 2i 1,i ,0 ,D21  Z , i  1..W  2
22
p 22  Psuc


(48)

To compute H 2i ,i ,D21  j ,D21  Z  , we define   Tc  
m 1
   
Z  , such as: TS 2 Z    p 22 Z H 20 ,0 ,0 ,D21  Z 
 Te 
j
Tdec 
 
 
0
Tdec Z   1 (49)  Ts    Tc  
i
  m
   
1  p 21 Z
1  p 22 Z  Te  H 2

0 ,0 ,0 ,D21  Z   p 22 Z

 Te 
H 20 ,0 ,0 ,D21  Z 

j
Tdec Z   i 0
 
  Ts   Tc  
 21  Te 
 
  (55)
1   Psuc Z 21 T j 1
 p 21  Psuc Z  e  Tdec Z 
 
 
6.2 Average Service Time
for j  1..D 21
From equations (44) (respectively equation
(50) (55)), we derive the average service time of a
category C1 station ( respectively a category C 2
So:
station). The average service time of a category C i
station is given by:
H 2i ,i ,D21  j ,D21  Z   H 2i ,i ,D21  j 1,D21  Z  j
Tdec Z ,
X i  TS i1 1 (56)
i  0..W  1, j  1..D21 , i , j   0 , D 21 
(51)
And:
Where TS i1 Z  , is the derivate of the service
H 2i ,i 1,0 ,D21  Z   1  p 22 ZH 2i 1,i ,0 ,D21  Z  time Z-transform of a category C i station [11].
1  p 22 ZH 2i ,i ,0 ,D21  Z  By considering the same configuration as in

  Ts   Tc   figure 3, we depict in figure 5, the average service
 22  Te 
 
 
1   Psuc Z 22 T D21
 p 22  Psuc Z  e  Tdec Z  time of category C1 and category C 2 stations as a
  function of W . As for the throughput analysis,
 
stations belonging to the same traffic category have
i  2..W  2 nearly the same average service value. Simulation
(52) service time values coincide with analytical values
given by equation (56). These results confirm the
H 2W 1,W  2 ,0 ,D21  Z  fairness of DM in serving stations of the same
1  p 22 ZH 2W 1,W 1,0 ,D21  Z  category.

  Ts   Tc   (53)
 22  Te 
 
 
Tdec Z 
22  Te  D21
1   Psuc Z  p 22  Psuc Z
 
 

According to figure 2 and using equation (51),


we have:

H 20 ,0 ,0 ,D21  Z   H 20 ,1,0 ,D21  Z Tdec


D21
Z 
1  p 22 ZH 21,1,0 ,D21  Z 

  Ts   Tc   (54)
 22  Te 
 
 
Tdec Z 
22  Te  D21
1   Psuc Z  p 22  Psuc Z
 
 
Figure 6: Average service time as a function of the
contention window size D 21  1, n  8 
Therefore, we can derive an expression of S 2
Z-transform service time as follows:
In figure 7, we show that category C1 stations
obtain better average service time than the one
obtained with 802.11 protocol. Whereas, the
opposite scenario happens for category C 2 stations

independently of n , the number of contending exceeds 0.01s equals 0.2%. Whereas, station S 2
stations in the network. service time exceeds 0.01s with the probability
57,6%. Thus, DM offers better service time
guarantees for the stations with the highest priority.

In figure 9, we double the size of the contention


window size and set it to 64. We notice that
category C1 and category C 2 stations service time
curves become closer. Indeed, when W becomes
large, the BAB values increase and the DMSB
becomes negligible compared to the basic backoff.
The whole backoff values of S 1 and S 2 become
closer and their service time accordingly.

Figure 7: Average service time as a function of the


number of contending stations

6.3 Service Time Distribution

Service time distribution is obtained by


inverting the service time Z transforms given by
equations (44) and (55). But we are most interested
in probabilistic service time bounds derived by
inverting the complementary service time Z
transform given by [11]:
Figure 9: Complementary service time distribution
~ 1  TS i Z  for different values of D21 ( W  64 )
X i Z   (56)
1 Z
In figure 10, we depict the complementary
In figure 8, we depict analytical and simulation service time distribution for both category C1 and
values of the complementary service time category C 2 stations and for different values of n ,
distribution of a category C1 and a category C 2
the number of contending nodes.
stations for different values of D21 and W  32  .

Figure 10: Complementary service time


Figure 8: Complementary service time distribution distribution for different values of the contending
for different values of D21 , W  32  stations

All the curves drop gradually to 0 as the delay Analytical and simulation results show that
increases. Category C1 stations curves drop to 0 complementary service time curves drop faster
when the number of contending stations is small for
faster than category C 2 curves. Indeed, when both category C1 and category C 2 stations. This
D21  4 slots, the probability that S 1 service time means that all stations service time increases as the

number of contending nodes increases. by different traffic categories stations as a function
of the minimum contention window size CW min
such as CW min is always smaller than CW max ,
7 EXTENSIONS OF THE ANALYTICAL
RESULTS BY SIMULATION CW max  1024 and K =1.
Analytical and simulation results show that
The mathematical analysis undertaken above throughput values increase with stations priorities.
showed that DM performs service differentiation Indeed, the station with the lowest delay bound has
over 802.11 protocol and offers better QoS the maximum throughput.
guarantees for highest priority stations
Nevertheless, the analysis was restricted to two Moreover, figure 12 shows that stations
traffic categories. In this section, we first generalize belonging to the same traffic category have the
the results by simulation for different traffic same throughput. For instance, when n is set to 15
categories. Then, we consider a simple multi-hop (i.e. m  3 ), the three stations each traffic category
and evaluate the performance of the DM policy have almost the same throughput.
when the stations belong to different broadcast
regions.

7.1 Extension of the analytical results


In this section, we consider n stations
contending for the channel in the same broadcast
region. The n stations belong to 5 traffic categories
where n  5m and m is the number of stations of
the same traffic category. A traffic category C i is
characterized by a delay bound Di , and
Dij  Di  D j is the difference between the
deadline values of category C i and category C j
stations. We have:
Dij  i  j K (57) Figure 12: Normalized throughput: different
Where K is the deadline multiplicity factor stations belonging to the same traffic category
and is given by:
Di 1,i  Di 1  Di  K (58) In figure 13, we depict the average service
time of the different traffic categories stations as a
function of K , the deadline multiplicity factor. We
Indeed, when K varies, the Dij the difference notice that the highest priority station average
between deadline values of category C i and service time decreases as the deadline multiplicity
category Cj stations also varies. Stations factor increases. Whereas, the lowest priority
station average service time increases with K .
belonging to the traffic category C i are numbered
from S i 1 to S im .

Figure 13: Average service time as a function of


Figure 11: Normalized throughput for different the deadline multiplicity factor K
traffic category stations
In the same way, the probabilistic service time
In figure 11, we depict the throughput achieved bounds offered to S 11 (the highest priority station)

are better than those offered to station S 51 (the Flows packets are routed using the Ad-hoc On
lowest priority station). Indeed, the probability that Demand (AODV) protocol. Flows F1 and F2 are
S 11 service time exceeds 0.01s=0.3%. But, station respectively transmitted by stations S 1 and S 2
S 51 service time exceeds 0.01s with the probability with delay bounds D1 and D2 and
of 36%. D 21  D 2  D1 =5 slots. Flows F3 and F4 are
transmitted respectively by S 3 and S 4 and have
the same delay bound. Finally, F5 and F6 are
transmitted respectively by S 5 and S 6 with delay
bounds D5 and D6 and D65  D6  D5 = 4 slots.

Figure 16 shows that the throughput achieved


by F1 is smaller than the one achieved by F2 .
Indeed, both flows cross nodes 6 and 7, where F1
got a higher priority to access the medium than F2
when the DM policy is used. We obtain the same
results for flows F and F . Flows F3 and F4
5 6
Figure 14: Complementary service time have almost the same throughput since they have
distribution CWmin  32 , n  8  equal deadlines.

The above results generalize the analytical


model results and show once again that DM
performs service differentiation over 802.11 and
offer better guarantees in terms of throughput,
average service time and probabilistic service time
bounds for flows with short deadlines.

7.2 Simple Multi hop scenario


In the above study, we considered that
contending stations belong to the same broadcast
region. In reality, stations may not be within one
hop from each other. Thus a packet can go through
several hops before reaching its destination. Hence,
factors like routing protocols or interferences may
preclude the DM policy from working correctly. Figure 16: Normalized throughput using DM
policy
In the following paragraph, we evaluate the
performance of the DM policy in a multi-hop Figure 17 show that the complementary
environment. Hence, we consider a 13 node simple service time distribution curves drop to 0 faster for
multi-hop scenario described in figure 15. Six flows flow F1 than for flow F2 .
are transmitted over the network.

Figure 17: End to end complementary service time


Figure 15: Simple multi hop scenario distribution

The same behavior is obtained for flows F5 and F6, where F5 has the shortest delay bound.

Hence, we conclude that even in a multi-hop environment, the DM policy performs service differentiation over 802.11 and provides better QoS guarantees for flows with short deadlines.

8 CONCLUSION

In this paper we proposed to support the DM policy over the 802.11 protocol. Therefore, we used a distributed backoff scheduling algorithm and introduced a new medium access backoff policy. Then we proposed a Markov chain based mathematical model to evaluate the performance of the DM policy in terms of throughput, average medium access delay and medium access delay distribution. Analytical and simulation results showed that DM performs service differentiation over 802.11 and offers better guarantees in terms of throughput, average service time and probabilistic service time bounds for flows with small deadlines. Moreover, DM achieves fairness between stations belonging to the same traffic category.

Then, we extended by simulation the analytical results obtained for two traffic categories to different traffic categories. Simulation results showed that even if contending stations belong to K traffic categories, K > 2, the DM policy offers better QoS guarantees for the highest priority stations. Finally, we considered a simple multi-hop scenario and concluded that factors like routing messages or interferences do not impact the behavior of the DM policy, and DM still provides better QoS guarantees for stations with short deadlines.

9 REFERENCES

G. Bianchi: Performance Analysis of the IEEE 802.11 Distributed Coordination Function, IEEE J-SAC, Vol. 18, No. 3, (March 2000).
H. Wu, Y. Peng, K. Long, S. Cheng, J. Ma: Performance of Reliable Transport Protocol over IEEE 802.11 Wireless LAN: Analysis and Enhancement, In Proceedings of IEEE INFOCOM'02, (June 2002).
H. Zhai, Y. Kwon, Y. Fang: Performance Analysis of IEEE 802.11 MAC protocol in wireless LANs, Wireless Computer and Mobile Computing, (2004).
I. Aad, C. Castelluccia: Differentiation mechanisms for IEEE 802.11, In Proceedings of IEEE Infocom 2001, (April 2001).
IEEE 802.11 WG: Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specification, IEEE, (1999).
IEEE 802.11 WG: Draft Supplement to Part 11: Wireless Medium Access Control (MAC) and physical layer (PHY) specifications: Medium Access Control (MAC) Enhancements for Quality of Service (QoS), IEEE 802.11e/D13.0, (January 2005).
J. Deng, R. S. Chang: A priority Scheme for IEEE 802.11 DCF Access Method, IEICE Transactions in Communications, Vol. 82-B, No. 1, (January 1999).
J. L. Sobrinho, A. S. Krishnakumar: Real-time traffic over the IEEE 802.11 medium access control layer, Bell Labs Technical Journal, pp. 172-187, (1996).
J. Y. T. Leung, J. Whitehead: On the Complexity of Fixed-Priority Scheduling of Periodic, Real-Time Tasks, Performance Evaluation (Netherlands), pp. 237-250, (1982).
K. Duffy, D. Malone, D. J. Leith: Modeling the 802.11 Distributed Coordination Function in Non-saturated Conditions, IEEE/ACM Transactions on Networking (TON), Vol. 15, pp. 159-172, (February 2007).
L. Kleinrock: Queueing Systems, Vol. 1: Theory, Wiley Interscience, (1976).
P. Chatzimisios, V. Vitsas, A. C. Boucouvalas: Throughput and delay analysis of IEEE 802.11 protocol, In Proceedings of the 2002 IEEE 5th International Workshop on Networked Appliances, (2002).
P. E. Engelstad, O. N. Osterbo: Delay and Throughput Analysis of IEEE 802.11e EDCA with Starvation Prediction, In Proceedings of the IEEE Conference on Local Computer Networks, LCN'05, (2005).
P. E. Engelstad, O. N. Osterbo: Queueing Delay Analysis of 802.11e EDCA, In Proceedings of the Third Annual Conference on Wireless On-demand Network Systems and Services (WONS 2006), France, (January 2006).
P. E. Engelstad, O. N. Osterbo: The Delay Distribution of IEEE 802.11e EDCA and 802.11 DCF, In Proceedings of the 25th IEEE International Performance Computing and Communications Conference (IPCCC'06), USA, (April 2006).
The network simulator ns-2, http://www.isi.edu/nsnam/ns/.
Y. Xiao: Performance analysis of IEEE 802.11e EDCF under saturation conditions, In Proceedings of the International Conference on Communication (ICC'04), Paris, France, (June 2004).
V. Kanodia, C. Li: Distributed Priority Scheduling and Medium Access in Ad-hoc Networks, ACM Wireless Networks, Vol. 8, (November 2002).
TEMPORAL INFORMATION SYSTEMS AND THEIR APPLICATIONS
TO MOBILE AD HOC ROUTING

V. Mary Anita Rajam, V.Uma Maheswari and Arul Siromoney


Department of Computer Science and Engineering,
Anna University, Chennai -- 600 025, India

Contact email: [email protected]

ABSTRACT
A Temporal Information System, that incorporates temporal information into the
traditional Information System of Rough Set Theory (RST) and Variable Precision
Rough Sets (VPRS), is presented. Mobile Ad hoc Networks (MANETs)
dynamically form a network without an existing infrastructure. The Dynamic
Source Routing (DSR) protocol of MANETs is modified in this paper to use recent
routes. Weighted elementary sets are introduced in temporal information systems
and used to route packets in mobile ad hoc networks. Notions from VPRS are also
brought into weighted temporal information systems and used in routing. The
performance of these proposed routing protocols is studied.

Keywords: Rough Set Theory, Mobile ad hoc networks, Temporal Information System, Routing.

1 INTRODUCTION

Rough Set Theory is a mathematical tool that deals with vagueness and uncertainty. Mobile ad hoc networks are a collection of wireless mobile nodes that can dynamically form a network. The state of each link in a mobile ad hoc network changes with time.

This paper introduces temporal information systems that are then applied to mobile ad hoc routing.

Each mobile node maintains a route cache of known routes. It is shown in this paper that giving more importance to the recent routes in the route cache is useful. This has led to the notion of weighted elementary sets in temporal information systems, where more recent elementary sets are given more importance.

1.1 Rough Set Theory

In Rough Set Theory (RST) [21], introduced by Zdzislaw Pawlak, a data set is represented as a table, where each row represents an event or an object or an example or an entity or an element. Each column represents an attribute that can be measured for an element. This table is called an information system. The set of all elements is known as the universe. For example, if the information system describes a hospital, the elements may be patients; the attributes (condition attributes) may be symptoms and tests; and the decisions (or decision attribute) may be diseases.

In an information system, elements that have the same value for each attribute are indiscernible and are called elementary sets. Subsets of the universe with the same value of the decision attribute are called concepts. A positive element is an element of the universe that belongs to the concept.

For each concept, the greatest union of elementary sets contained in the concept is called the lower approximation of the concept, and the least union of elementary sets containing the concept is called the upper approximation of the concept. The set containing the elements from the upper approximation of the concept that are not members of the lower approximation is called the boundary region. The lower approximation of the concept is also known as the positive region.

A set is said to be rough if the boundary region is non-empty. A set is said to be crisp if the boundary region is empty.

Variable Precision Rough Sets (VPRS) [31], proposed by Ziarko, is a generalization of the rough set model, aimed at modelling classification problems involving uncertain or imprecise information. Classification with a controlled degree of uncertainty is possible with this model. It is possible to generalize conclusions obtained from a smaller set of observations to a larger population.
system describes a hospital, the elements may be possible to generalize conclusions obtained from a
patients; the attributes (condition attributes) may be smaller set of observations to a larger population.

In RST, the lower approximation of a stamp is introduced as an attribute, in addition to the
concept is defined using an inclusion relation. Here condition attributes and the decision attribute of the
in VPRS, the lower approximation is defined using a traditional information system.
majority inclusion relation. The β-positive region is
the union of elementary sets which are either A temporal information system is
completely contained in the concept or are almost introduced by Bjorvand [29]. The temporal
contained in the concept, with a maximum error of information system has a sequence attribute in
1 – β. addition to the condition attributes and the decision
attribute present in the traditional information
The conditional probability of an element system. The value of the sequence attribute is an
being positive in an elementary set is the probability integer, based on the time of occurrence of the
that the element is positive, given that the element objects in the information system. Methods are
belongs to that elementary set. It is the ratio of the proposed in [29] and [28] to convert this temporal
number of positive elements in that elementary set to information system into the traditional information
the number of elements in that elementary set. When system, so that rough set techniques can be applied.
this conditional probability is greater than a threshold The method proposed in [29] depends on time
β (0.5 < β ≤ 1) the elementary set is said to fall in the intervals that must be fixed and defined in advance.
β-positive region. Trend expressions [28] are added to transform the
traditional method of translating a temporal
1.2 Rough Sets in Temporal Contexts information system into an information system. So, a
new attribute is added to the information system, for
A temporal system is a time based system which values are set based on the trend. A real time
which shows the temporal variation of some specific temporal information system is proposed in [29], in
data or attribute. Time-series data are a kind of which the difference between the time of occurrence
temporal data and are results of some observations of the current row and the previous row (if rows are
usually ordered in time. Time-series data often sorted according to time) is also stored in the
possess content that is conflicting and redundant. information table.
The data may be imprecise; therefore a precise
understanding of information cannot be derived from Having a linearly ordered universe, based
the data. Rough set theory offers a powerful toolset on time, is also used in bringing in the notion of time
for confronting this situation. into a rough set information system. This temporal
information system with a linearly ordered universe
Analysis of time-series data and [19], [8], [1], [3], [20], [2] provides information
constructing suitable data from the time-series that about the behaviour of objects in time and state of an
can be used by rough sets are investigated in [15], object is described by some attributes. The elements
[18], [11]. Reducts can be found from the original can be the behaviour over time of the same object,
data and rules can be generated from the acquired multiple objects, or objects independent of each
reducts using rough sets [15]. For constructing other.
suitable data from the time-series, different methods
are tried. In the mobile window method [4], a Temporal templates [19], which are
window is moved along the time-series; the data homogeneous patterns occurring in some periods, are
points falling into the window are transferred into a extracted from temporal information systems. The
rough sets object. The window method poses temporal templates are then used to discover
restrictions on how far back in time dependencies behaviour of temporal features of objects.
can be traced. In the columnizing method [18], the
time series are organized in columns, such that each Using temporal templates [20], a temporal
row represents an object where each column is an multiple information system [19] is introduced that
economic indicator, and each row represents a describes many objects along the time axis e.g.
different point in time. several users visiting a website. For each object, a
sequence of temporal templates is found. Collections
In [11], time-series is represented using a of patterns, called episodes, appearing in sequences
series of events or states. An event is something that frequently are found. New attributes are generated
occurs, and is associated with a time. The values of from the found frequent episodes.
attributes are trends, rather than values measured at
points in time, so that dependencies can be traced Temporal patterns that can be potentially
back as long as required. specific for one musical instrument or a group of
instruments are searched [8]. Such patterns are used
A decision table for time-series called a as new descriptors. A time window based technique
time series decision table, is proposed in [16]. A time is used to measure the values of the descriptors in

time. Optimal temporal templates that respond to 1.3 Application of Rough Sets to Computer
temporal patterns are then determined. From the Networks
temporal templates, episodes (collections of
templates that occur together) are found. Very little work has been done in the
application of rough set theory to Mobile Ad Hoc
Information maps can be constructed from Networks and Mobile Ad Hoc Routing. A few papers
data represented by temporal information systems have applied rough set theory to networks. Rough set
[1]. The temporal information system at the current theory is used in intrusion detection in computer
time t is also viewed as a family of decision systems networks [6], [14] and [32]. Rough set approach is
[1], where the universe of the decision system at time also applied to the flow control of a UDP-based file
t1, is a subset of the decision system at time t2, for transfer protocol [33] to accomplish a real time data-
t1 < t2. transferring rate adjustment task; to network fault
diagnosis [24]; and to achieve the rules for object
Some work is based on how the data varied recognition and classification for mobile robots [12].
over a particular duration of time (say, two years, Rough set logics is combined with artificial neural
two months etc.). Different attributes are assigned networks for failure domain exploration of
for each time period in [30]. telecommunication network [7]. The basic concepts
of rough set theory are illustrated using an example
A dynamic information system model concerning churn modeling in telecommunications
based on time sequence is proposed in [13]. The [23].
attributes in the information system can have
different values at different time points. 1.4 Overview of the Paper

1.3 Routing in Mobile Ad Hoc Networks This paper presents a Temporal Information
System that brings temporal information into the
Sending data from a source mobile node to traditional Information System of Rough Set Theory
a destination mobile node through a route (or and Variable Precision Rough Sets. The DSR
sequence of intermediate mobile nodes) is routing. protocol is modified to study the use of recent routes.
Routing is one of the most difficult issues in mobile The paper then introduces the notion of weighted
ad hoc networks. Each node in an ad hoc network is elementary sets in temporal information systems and
responsible for routing. Hence each node should uses this to route packets in mobile ad hoc networks.
maintain the information necessary for routing, that This paper then uses VPRS in weighted temporal
is, the next hop or the path through which the data information systems and uses this in routing. The
has to be routed to the destination. This information performance of these proposed routing protocols are
is either available even before it is needed studied.
(proactive) or is got only when necessary (reactive).
2 TEMPORAL INFORMATION SYSTEMS
Proactive routing is usually done using table
driven protocols where tables are formed initially, 2.1 Information Systems and Decision Systems
and are updated either periodically or when some
change is known. Consider a universe U of elements. An
information system I is defined as I = (U, A, V, ρ),
Reactive routing protocols are also known where
as on-demand routing protocols. In this kind of A is a non-empty, finite set of attributes;
protocols, a route is found for a destination only is the set of attribute values
when there is a need to send information to the of all attributes,
destination. Each node does not have knowledge of where is the set of possible values of
the location of all other nodes in the network. All attribute ;
nodes when they learn or use routes, store the routes is an information function,
to the destinations in a routing table or a cache. such that for every element ,
is the value of attribute for
Most of the on-demand routing protocols element .
[5], [25], [9], discover routes when needed by The information system can also be viewed as an
flooding route request packets in the network. A information table, where each element
route reply is sent back to the source node by either corresponds to a row, and each attribute
an intermediate node that has a route to the corresponds to a column.
destination or by the destination.
I = (U, A, V, ρ), is known as a decision
system, when an attribute is specified as the

decision attribute. A decision system is used for elementary sets whose conditional probability is
predicting the value of the decision attribute. greater than or equal to where 0.5. The -
is known as the set of condition attributes. negative region is the union of the elementary sets
The concept is the set of elements of that whose conditional probability is less than where
have a particular value (say, ) of the decision 0.5. These are based on the definitions in [34].
attribute . That is, When , we denote it as , and note that
Normally, is a boolean attribute that takes one of .
two possible values. When is known as
a multi-valued decision attribute. The range of is (0.5,1] in the original
VPRS definition. This indicates probabilistically that
These definitions are based on the definition the elementary set is positive, when the decision
of Rough Set Information System in [21, 22, 17]. attribute is boolean. It appears that when the decision
attribute is multi-valued with as the number of
2.2 Regions of the Universe possible values, the range of is (1/k,1].

An equivalence relation , called 2.3 Temporal Extensions


indiscernibility relation, is defined on the universe
as A Generic Temporal Information System
(GTIS) is defined as a set of information tables,
with each information table
located at a time on the time
In the information system , the elementary axis. A time interval can also be considered instead
set containing the element with respect to the of a time instance.
indiscernibility relation , is
A special case is when , that is, the
same universe of elements that appears in each
information table. A particular element , for
The lower approximation of the concept each attribute , would have a value
, with respect to and equivalence relation at each time . For example, patient $x$
on , is the union of the elementary sets of with has fever at , and does not have fever at . This
respect to that are contained in , and is denoted special case is when each .
as
In this paper, this is treated as a single
The upper approximation of is the union Temporal Information System defined as
of the elementary sets of with respect to that , where is the time attribute with
have a non-zero intersection with , and is denoted as a set of pairs with
as as a sequence of time instances,

The lower approximation of is also


known as the Positive region of . The set For each elementary set that is formed from
is called the Boundary region the set of attributes (ie the set of attributes without
of . The set is called the Negative region of the time information), there are now elementary
. sets, where the first elementary set consists of
elements that occurred between time instance and
The conditional probability that an element , the second elementary set of elements between
in an elementary set is positive is and , and the last elementary set of elements
between and . This can be pictured as vertical
blocks of elementary sets along a time axis.

The conditional probability that the element 2.4 Information System in a Mobile Node
in the elementary set is negative is
The use of an information system in mobile
ad hoc routing was introduced in [26]. The
information system was modified in [27] to represent
When the context is clear, the conditional probability the route better, by using the link information rather
of an elementary set is taken to be . than the node information. A threshold was used in
the identification of a good next hop.
The -positive region is the union of the

Let V be a set of mobile nodes. A route is a path through mobile nodes in V and is denoted as a sequence of mobile nodes. Each mobile node maintains a route cache that stores all the routes that the node knows. Any route in the route cache is a path starting from that mobile node, and so the first node in the route is the mobile node itself. Any route in the route cache is a simple path, where no node repeats.

Each mobile node has an information table associated with it. Each row in the information table corresponds to a route in the route cache maintained by that mobile node.

Let L be the set of all possible links between the nodes. Each condition attribute corresponds to a particular link in the set of all possible links between the nodes. So, the set of condition attributes is the same for any mobile node and is denoted in the same way in each mobile node. Each condition attribute is a boolean attribute, and is set to 1 or 0 depending on whether or not that link is present in the route corresponding to that element.

A mobile node knows a route either because the route is in a packet that passes through this mobile node, or because this mobile node is in promiscuous mode and this route is in a packet that passes between two nodes that are within range of this mobile node. When a mobile node knows a route, it is added to the route cache only if it is not identical to a path or a sub-path of any other route already present in the route cache. However, every time a mobile node knows a route, a row corresponding to this route is always added to its information table.

Consider an element x corresponding to a route. When a row is added to the information table, the values of the condition attributes corresponding to the links in the route are set as 1.

2.5 Decision System in a Mobile Node

In traditional Rough Set Theory and VPRS, the value of the decision attribute of a new element (or unknown element or test case) is predicted, based on which elementary set it falls into. The elementary set into which it falls is determined by the values of the attributes of that element.

The decision system in a mobile node is used to predict the next hop for a particular destination. This next hop is called the predicted next hop. The decision attribute is taken as the next hop, and the predicted next hop is also known as the predicted value of the decision attribute.

The destination can possibly be reached through several different sequences of intermediary nodes. In other words, several different combinations of attribute values make it possible for the destination to be reached. That is, elements in several different elementary sets correspond to routes that lead to this particular destination. Thus several elementary sets play a role in identifying the best next hop for a particular destination. So, it is not possible to use a single elementary set to predict the value of the decision attribute. The union of these elementary sets is used. In other words, for a particular destination, the union is taken of all the elementary sets that correspond to valid routes from the current mobile node to the destination.

A stringent method of predicting the next hop is when all the elements in this union of elementary sets have the same value of the decision attribute; this value is then taken as the predicted next hop. In other words, all known routes to this destination should have this particular node as the next hop. This can also be considered as that value of the decision attribute for which all these elementary sets are in its lower approximation. It is to be remembered that the decision attribute is a multi-valued attribute, and so the lower approximation is with respect to a value of the decision attribute.

Another method is to have the predicted next hop as that value of the decision attribute where the union of these elementary sets is in the $\beta$-positive region. The conditional probability is determined using the union of elementary sets, and not a single elementary set. The probability that a particular next hop occurs, given that the route leads to a particular destination, is taken as the conditional probability. This conditional probability should be greater than a threshold $\beta$. In other words, a large number of known routes to this destination have this particular node as the next hop.

2.6 Temporal Decision System in a Mobile Node

In a Temporal Information System (TIS) for a mobile node, each element (corresponding to a route) has a particular value of the time attribute, that is, each element falls in a particular time interval. This is determined by the time stamp of the next hop of the route that corresponds to this element. The Temporal Decision System (TDS) in a mobile node is used to predict the next hop.



An appropriate method (as described in the previous section) is used to determine the predicted next hop in each time interval. The predicted next hop for the TDS is then determined based on the number of time intervals in which it is the predicted next hop.

The predicted value of the decision attribute is determined from the TDS based on the probability of a particular value of the decision attribute being the predicted value in the different time intervals. This probability is the number of time intervals in which that value of the decision attribute is the predicted value divided by the total number of time intervals. The predicted value of the decision attribute is the value for which this probability is greater than a threshold. In other words, that particular next hop has been the predicted next hop in most of the time intervals.

In Weighted Temporal Information Systems (WTIS) and Weighted Temporal Decision Systems (WTDS), weights are assigned to the elementary sets in each of the time intervals. The predicted value of the decision attribute is determined after associating weights with the time intervals. The predicted value of the decision attribute is that value of the decision attribute for which the probability is greater than a threshold. The probability is the sum of the weights of the time intervals in which that value of the decision attribute is the predicted next hop divided by the sum of the weights of all the time intervals.

When the more recent time intervals play a more important role, the weight of a more recent time interval is higher than the weight of a less recent time interval.

3 MOBILE AD HOC ROUTING USING RECENT ROUTES

This section first describes the original Dynamic Source Routing (DSR) protocol, an on-demand routing protocol. The DSR protocol is then modified so that the most recent applicable route in the route cache is used at each intermediate node. The performance of the modified protocol is evaluated.

3.1 Dynamic Source Routing

In Dynamic Source Routing (DSR) [5], each mobile node has a route cache to store the routes that are known to that mobile node.

The source node is the node that wants to send a data packet. It uses the shortest route present in its route cache to the destination. If there is no route in the route cache, it initiates a route discovery, and gets back a route reply with the route to the destination. This source route is placed in the data packet and the data packet is sent to the next hop in the route.

When a data packet reaches an intermediate node, the source route in the data packet is used to forward the data packet to the next hop. If a node, while sending the data packet to the next hop, finds that the link does not exist, it uses the shortest route present in its route cache to that destination. If the route is not found in the route cache, route discovery is done.

When a route discovery is required, the node broadcasts route request packets. If any intermediate node receiving the route request has a route to the required destination in its route cache, it sends a route reply to the initiator of the route discovery. Else, the node appends its own address to the route in the route request and re-broadcasts the route request. If the route request reaches the destination, the destination reverses the route and sends back a route reply to the initiator of the route discovery.

When a path is to be added to the cache, the following are done. If a prefix of the path to be added is present in the cache, the rest of this path is appended to the path present in the cache. If the whole of the new path to be added is not present in the cache, the path is added to the cache. If there is no free space in the cache, then a victim is picked up and the route to be added is put in the victim's place. If there is any path or a subpath in the cache entry that is a prefix of, or the same as, the path that is added, then the time stamps of those links are set to be equal to the time stamps of the links in the path that is added.

When a link error occurs, a route error with information about the dead link is sent to the original sender of the data packet. Paths, or subpaths starting with the given dead link, are removed from the cache in nodes that receive the route error.

3.2 DSRrecent

This section describes the proposed DSRrecent protocol and the modifications made to the existing DSR protocol.

In the source node, in DSR, the shortest route in the route cache is used, whereas in DSRrecent, the route to the destination in the route cache that has the most recent next hop is used (using algorithm findRecentRoute).
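The interval-based prediction of section 2.6 can be sketched as follows (illustrative Python, not from the paper; the weight and threshold names are assumptions):

    def predict_next_hop(per_interval_predictions, weights, threshold):
        # per_interval_predictions: predicted next hop per time interval (None if no prediction).
        # weights: one weight per interval, higher for more recent intervals.
        # Returns the next hop whose weighted share of intervals exceeds the threshold.
        total = sum(weights)
        score = {}
        for hop, w in zip(per_interval_predictions, weights):
            if hop is not None:
                score[hop] = score.get(hop, 0.0) + w
        for hop, s in score.items():
            if s / total > threshold:
                return hop
        return None

    # Example (most recent of four intervals has weight 4):
    # predict_next_hop(['B', 'B', 'C', 'B'], weights=[1, 2, 3, 4], threshold=0.5)  ->  'B'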



In intermediate nodes, in DSR, the source route is used to determine the next hop. Only if there is a link error is the route cache used. The shortest route to the destination in the route cache is then placed in the data packet, instead of the original source route.

However, in DSRrecent, the best route is determined from the route cache (using algorithm findRecentRoute). The route in the data packet is then modified such that the route from the route cache replaces the subpath (from the current node) in the route in the data packet.

If no route is found (to the destination) in the route cache of the intermediate node, or if the found route will result in a loop, the data packet is forwarded according to the existing route in the data packet.

If a link error occurs in any node, the best route found in the route cache (using algorithm findRecentRoute) is used. If there is no route in the route cache, route discovery is done.

findRecentRoute() {
    besttime = 0;
    bestroute = NULL;
    foreach possible-route do
        t = time of next hop in possible-route;
        if t > besttime then
            besttime = t;
            bestroute = possible-route;
        end
    end
    return bestroute;
}

3.3 Performance Evaluation

The performance of DSRrecent is evaluated using the following metrics, which are normally used in such studies:

(i) Packet delivery ratio: The ratio of the data packets delivered to the application layer of the destination to those sent by the application layer of the source node.

(ii) Normalized control overhead: The ratio of the number of control packets sent to the number of data packets received in the application layer.

(iii) Average end-to-end delay: The average delay from when a packet is sent by the source node until it is received by the destination node.

(iv) Average hop count: The average number of hops from the source node to the destination node.

The network simulator ns2 [10] is used for the experiments. The following parameters are ones that have often been used in such studies. The random waypoint mobility model is used in a rectangular field. Constant bit rate traffic sources are used. The radio model in the simulator is based on the Lucent Technologies WaveLAN 802.11, providing a 2 Mbps transmission rate. A transmission range of 250 m is used. The link layer modeled is the Distributed Coordination Function (DCF) of the IEEE 802.11 wireless LAN standard. The source-destination pairs (connections) are spread randomly over the network. 512 byte data packets are used.

Nodes move in a field with dimensions 1500 m x 300 m with a maximum speed of 2 m/sec. The pause time is 20 seconds. The number of nodes is kept fixed as 50. The number of communicating source-destination pairs is varied from 5 to 40. Simulations are run for 1000 simulated seconds.

Figure 1: Packet delivery ratio vs. number of connections for DSR and DSRrecent

Figure 2: Normalized control overhead vs. number of connections for DSR and DSRrecent
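A runnable version of the findRecentRoute selection might look like the sketch below (Python; the route-cache representation is an assumption, not the paper's implementation):

    def find_recent_route(route_cache, next_hop_time):
        # route_cache: list of routes, each a list of node ids starting at this node.
        # next_hop_time: maps (node, next_hop) links to the time stamp of that link.
        best_time, best_route = 0, None
        for route in route_cache:
            if len(route) < 2:
                continue
            t = next_hop_time.get((route[0], route[1]), 0)   # time stamp of the next hop
            if t > best_time:
                best_time, best_route = t, route
        return best_route

    # cache = [['A', 'B', 'D'], ['A', 'C', 'D']]
    # stamps = {('A', 'B'): 17.2, ('A', 'C'): 40.5}
    # find_recent_route(cache, stamps)  ->  ['A', 'C', 'D']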



The packet delivery ratios for DSR and DSRrecent are very similar for 5 and 10 connections. When the number of connections is increased from 20 to 40 there is an improvement in the packet delivery ratio from 3% to 11% (Fig. 1). The normalized control overhead for DSRrecent is more than that for DSR when the number of sources is 5. With increase in the number of connections from 10 to 40, there is an improvement of 4% to 25% over DSR (Fig. 2). In average hop length and average end-to-end delay, DSRrecent is seen to perform worse than DSR as the number of connections is increased (Fig. 3, Fig. 4).

Figure 3: Average hop count vs. number of connections for DSR and DSRrecent

Figure 4: Average end-to-end delay vs. number of connections for DSR and DSRrecent

4 MOBILE AD HOC ROUTING USING WEIGHTED TEMPORAL INFORMATION SYSTEMS (WTIS)

In Temporal Information Systems, each elementary set is associated with a particular time interval. In Weighted Temporal Information Systems, elementary sets in different time intervals have weights. Since it is seen in the previous section that the recent route in the route cache is useful, more recent time intervals are assigned higher weights than the less recent time intervals.

The use of the WTIS to predict the value of the decision attribute has already been described in section 2.6. This uses the predicted value of the decision attribute in different time intervals. The experiment described here uses a simple approach to determine the predicted value of the decision attribute in a particular time interval. A predicted value of the decision attribute in a time interval has at least one element, with that value of the decision attribute, in the union of elementary sets. That is, the union is in the upper approximation for that value of the decision attribute.

4.1 Routing Based on WTIS

Here, the route cache of the mobile node is used as the WTIS. Routes that are learnt and used are added to the cache of the mobile node. When routes are added, the time stamp of each link is added along with the routes. However, unlike DSR, even if the same route is present in the cache earlier, the new route is added with the new time stamps. So, the cache now has the same route multiple times, but with different time stamps.

In the source node, initially, as in DSR, a shortest route in the route cache, if available, is placed as the source route in the data packet. If not available, route discovery is done.

Then in the source node, and in any intermediate forwarding node, the WTIS is used to determine the best next hop (using algorithm findWeightBasedHop). If the next hop is found, and does not result in a loop, the data packet will be forwarded to this next hop. If this next hop is different from the one in the source route that is already in the data packet, this new next hop is appended to the source route in the data packet at the current node and the route is invalidated by setting a flag in the data packet.

If a next hop cannot be determined from the WTIS, or if the next hop results in a loop, and if the source route in the data packet has not been invalidated earlier, the data packet is forwarded according to the source route. Else, a route discovery is done.
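The cache organisation described in section 4.1 can be pictured with a small sketch (Python; an illustrative layout, not the implementation used in the paper):

    import time

    class TimedRouteCache:
        # Route cache used as a WTIS: the same route may appear several times,
        # each occurrence carrying its own link time stamps.
        def __init__(self):
            self.entries = []   # list of (route, {link: time_stamp}) tuples

        def add_route(self, route, stamp=None):
            stamp = stamp if stamp is not None else time.time()
            links = {(a, b): stamp for a, b in zip(route, route[1:])}
            self.entries.append((route, links))

        def routes_to(self, destination):
            return [(r, links) for r, links in self.entries if r[-1] == destination]

    # cache = TimedRouteCache()
    # cache.add_route(['A', 'B', 'D']); cache.add_route(['A', 'C', 'D'])
    # cache.routes_to('D')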



The total time is divided into time intervals. The list of next hops to the destination that are present in the route cache is found. For each possible next hop, from the current time interval till the initial time interval, a weighted sum of the number of times that the particular next hop is used is found. More weight is assigned if the next hop has been used in the recent past. That is, the weights assigned decrease for earlier time intervals.

findWeightBasedHop() {
    Find all possible next hops that will lead to the destination from this node;
    foreach possible next hop nh do
        timeInterval = currentInterval;
        weightedSum = 0;
        weight = maxWeight;
        totalWeight = 0;
        while timeInterval >= 0 do
            if nh is used as a nexthop in timeInterval then
                weightedSum = weightedSum + weight;
            end
            timeInterval = timeInterval - 1;   // previous timeInterval
            totalWeight = totalWeight + weight;
            weight = weight - 1;
        end
        ratio[nh] = weightedSum / totalWeight;
    end
    Find the nexthop nh for which the value of ratio is the maximum and return it;
}

The ratio of the weighted sum of the usage of the node to the total weight is found. The node for which the ratio is greater than a threshold $\beta'$ is chosen as the next hop.

4.2 Performance Evaluation

The parameters used are the same as those given in section 3.3. The size of the time interval is taken as 40 seconds. The value of $\beta'$ used is 0.5.

The proposed protocol used in this section is referred to as TIME_WT. The packet delivery ratios for DSR and TIME_WT are nearly similar for 5 and 10 connections. When the number of connections is increased from 20 to 40 there is a slight improvement in the packet delivery ratio from 5% to 7% (Fig. 5).

The normalized control overhead for TIME_WT is more than that for DSR when the number of sources is 5 and 10. With increase in the number of connections from 20 to 40, there is an average improvement of 14% to 22% over DSR (Fig. 6).

The average hop length and the average end-to-end delay for TIME_WT are more than those for DSR when the number of sources is 5 and 10. But when the number of connections is increased from 20 to 40, it is seen that there is a slight improvement of about 2% in average hop length and of about 5% in average end-to-end delay over DSR (Fig. 7, Fig. 8).

Figure 5: Packet delivery ratio vs. number of connections for DSR and TIME_WT

Figure 6: Normalized control overhead vs. number of connections for DSR and TIME_WT

Figure 7: Average hop count vs. number of connections for DSR and TIME_WT
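A runnable sketch of the weighted selection performed by findWeightBasedHop (Python; the usage representation and parameter names are assumptions):

    def find_weight_based_hop(usage, current_interval, max_weight, beta_prime):
        # usage[nh]: set of time intervals (0 .. current_interval) in which next hop nh
        # appears on a cached route to the destination. Weights decrease for older intervals.
        ratios = {}
        for nh, intervals in usage.items():
            weighted_sum, total_weight, weight = 0, 0, max_weight
            interval = current_interval
            while interval >= 0:
                if interval in intervals:
                    weighted_sum += weight
                total_weight += weight
                weight -= 1
                interval -= 1
            ratios[nh] = weighted_sum / total_weight
        best = max(ratios, key=ratios.get, default=None)
        # The text additionally requires the ratio to exceed the threshold beta'.
        if best is not None and ratios[best] > beta_prime:
            return best
        return None

    # find_weight_based_hop({'B': {3, 2}, 'C': {0}}, current_interval=3,
    #                       max_weight=4, beta_prime=0.5)   ->  'B'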



Figure 8: Average end-to-end delay vs. number of connections for DSR and TIME_WT

5 MOBILE AD HOC ROUTING USING BETA-POSITIVE REGIONS IN WTIS

5.1 Routing Based on Beta-Positive Regions

The experiment described in this section determines the predicted next hop as that value of the decision attribute where the union of these elementary sets is in the $\beta$-positive region, as described in section 2.5.

findVPRSWeightBasedHop() {
    Find all possible next hops that will lead to the destination from this node;
    foreach possible next hop nh do
        timeInterval = currentInterval - 1;
        weightedSum = 0;
        weight = maxWeight;
        totalWeight = 0;
        while timeInterval >= currentInterval - k do
            nhopCount = the number of routes with next hop nh that will lead to the destination in timeInterval;
            totalCount = the number of routes that will lead to the destination in timeInterval;
            ratio1 = nhopCount / totalCount;
            if ratio1 > $\beta$ then
                weightedSum = weightedSum + weight;
            end
            timeInterval = timeInterval - 1;   // previous timeInterval
            totalWeight = totalWeight + weight;
            weight = weight - 1;
        end
        ratio[nh] = weightedSum / totalWeight;
    end
    Find the nexthop nh for which the value of ratio is greater than $\beta'$;
}

The routing protocol is similar to that of the previous section. The next hop is chosen using the notion of threshold $\beta$ ($\beta$-positive regions) as described in algorithm findVPRSWeightBasedHop().

5.2 Performance Evaluation

The parameters used are the same as those given in section 3.3. The size of the time interval is taken as 40 seconds. The values of $\beta$ and $\beta'$ used are 0.6 and 0.5 respectively.

The proposed protocol used in this section is referred to as VPRS_WT. The packet delivery ratio for VPRS_WT is less than that for DSR for 5 and 10 connections. When the number of connections is increased from 20 to 40 there is a slight improvement in the packet delivery ratio from 2% to 6% (Fig. 9).

The normalized control overhead for VPRS_WT is more than that for DSR when the number of sources is 5 and 10. With increase in the number of connections from 20 to 40, there is an average improvement of 14% to 19% over DSR (Fig. 10).

The average hop length and the average end-to-end delay for VPRS_WT are more than those for DSR when the number of sources is 5 and 10. But when the number of connections is increased from 20 to 40, it is seen that there is a slight improvement of about 4% in average hop length over DSR (Fig. 11).

The average end-to-end delay for VPRS_WT is similar to that of DSR when the number of sources is 5. But when the number of connections is increased from 10 to 40, it is seen that there is an improvement of about 6% in average end-to-end delay over DSR (Fig. 12).

Figure 9: Packet delivery ratio vs. number of connections for DSR and VPRS_WT
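A runnable sketch of the $\beta$-positive selection in findVPRSWeightBasedHop (Python; the per-interval route representation is an assumption, not the paper's data structure):

    def find_vprs_weight_based_hop(routes_per_interval, current_interval, k,
                                   max_weight, beta, beta_prime):
        # routes_per_interval[i]: list of next-hop values of the cached routes to the
        # destination whose time stamps fall in time interval i.
        candidates = {nh for i in routes_per_interval for nh in routes_per_interval[i]}
        chosen = None
        for nh in candidates:
            weighted_sum, total_weight, weight = 0.0, 0.0, max_weight
            for interval in range(current_interval - 1, current_interval - 1 - k, -1):
                hops = routes_per_interval.get(interval, [])
                if hops:
                    ratio1 = hops.count(nh) / len(hops)   # conditional probability in this interval
                    if ratio1 > beta:                      # interval lies in the beta-positive region
                        weighted_sum += weight
                total_weight += weight
                weight -= 1
            if total_weight and weighted_sum / total_weight > beta_prime:
                chosen = nh
        return chosen

    # intervals = {3: ['B', 'B', 'C'], 2: ['B'], 1: ['C', 'B']}
    # find_vprs_weight_based_hop(intervals, current_interval=4, k=3,
    #                            max_weight=3, beta=0.6, beta_prime=0.5)  ->  'B'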



Figure 10: Normalized control overhead vs. number of connections for DSR and VPRS_WT

Figure 11: Average hop count vs. number of connections for DSR and VPRS_WT

Figure 12: Average end-to-end delay vs. number of connections for DSR and VPRS_WT

6 CONCLUSIONS

This paper presents temporal extensions to Rough Set Theory and Variable Precision Rough Sets. These extensions are applied to mobile ad hoc routing. Illustrative experiments are described and the results are presented.

Using recent routes (DSRrecent) was found to improve packet delivery ratio and normalized control overhead. Temporal information was brought into information systems. Recent elementary sets were given more importance in the two proposed methods, TIME_WT and VPRS_WT. The VPRS_WT method uses notions from VPRS. It was seen that the control overhead is much better, while the packet delivery ratio, average hop length and average end-to-end delay are slightly better than those of DSR. It was also seen that the improvement in performance increases with the number of connections.

7 REFERENCES

[1] A. Skowron and P. Synak. Patterns in Information Maps. In Rough Sets and Current Trends in Computing, volume 2475 of Lecture Notes in Artificial Intelligence, pages 453 - 460. Springer, 2002.
[2] A. Skowron and P. Synak. Reasoning Based on Information Changes in Information Maps. In Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, volume 2639 of Lecture Notes in Artificial Intelligence, pages 229 - 236. Springer, 2003.
[3] A. Wieczorkowska, J. Wroblewski, P. Synak and D. Slezak. Application of temporal descriptors to musical sound recognition. Journal of Intelligent Information Systems, 21(1):71 - 93, 2003.
[4] J.K. Baltzersen. An attempt to predict stock market data: a rough sets approach. Master's thesis, 1996.
[5] David B. Johnson and David A. Maltz. Dynamic source routing in ad hoc wireless networks. In Imielinski and Korth, editors, Mobile Computing, volume 353. Kluwer Academic Publishers, 1996.
[6] Zhongmin Cai, Xiaohong Guan, Ping Shaoa, and Guoji Sun. A rough set theory based method for anomaly intrusion detection in computer network systems. In Expert Systems, volume 20, pages 251 - 259, 2003.
[7] Frank Chiang and Robin Braun. Intelligent failure domain prediction in complex telecommunication networks with hybrid rough sets and adaptive neural nets. In 3rd International Information and Telecommunication Technologies Symposium, 2004.
[8] Slezak D., Synak P., Wieczorkowska A., and Wroblewski J. KDD-based approach to musical instrument sound recognition. In Proc. of the 13th International Symposium on Foundations of Intelligent Systems, Vol. 2366, Lecture Notes in Artificial Intelligence, Springer, pages 28 - 36, 2002.
[9] Rohit Dube, Cynthia D. Rais, Kuang-Yeh Wang, and Satish K. Tripathi. Signal stability-based adaptive routing (SSA) for ad hoc mobile networks. IEEE Personal Communications, 4(1):36 - 45, 1997.



[10] K. Fall and K. Varadhan. The ns manual (formerly ns notes and documentation), 2002. http://www.isi.edu/nsnam/ns/doc/index.html.
[11] Roed G. Knowledge extraction from process data: A rough set approach to data mining on time series. Master's thesis, 1999.
[12] Wang Haijun and Chen Yimin. Sensor data fusion using rough set for mobile robots system. In Proceedings of the 2nd IEEE/ASME International Conference on Mechatronic and Embedded Systems and Applications, pages 1 - 5, 2006.
[13] Xiaowei He, Liming Xu, and Wenzhong Shen. Dynamic information system and its rough set model based on time sequence. In Proc. of 2006 IEEE International Conference on Granular Computing, pages 542 - 545, 2006.
[14] Peng Hong, Dongna Zhang, and Tiefeng Wu. An intrusion detection method based on rough set and SVM algorithm. In 2004 International Conference on Communications, Circuits and Systems, ICCCAS 2004, pages 1127 - 1130, vol. 2.
[15] Herbert J. and Yao J. T. Time-series data analysis with rough sets. In Proc. of 4th International Conference on Computational Intelligence in Economics and Finance, pages 908 - 911, 2005.
[16] Li J., Xia G., and Shi X. Association rules mining from time series based on rough set. In Proc. of the Sixth International Conference on Intelligent Systems Design and Applications, pages 509 - 516, 2006.
[17] J. Komorowski, Z. Pawlak, L. Polkowski, and A. Skowron. Rough sets: A tutorial. In S. K. Pal and A. Skowron, editors, Rough Fuzzy Hybridization: A New Trend in Decision-Making, pages 3 - 98. Springer-Verlag, 1999.
[18] Shen L. and Loh H. T. Applying rough sets to market timing decisions. Decision Support Systems, Special Issue: Data Mining for Financial Decision Making, 37(4):583 - 597, 2004.
[19] Synak P. Temporal templates and analysis of time related data. In Rough Sets and Current Trends in Computing, volume 2005 of Lecture Notes in Computer Science, pages 420 - 427. Springer, 2000.
[20] Synak P. Temporal feature extraction from temporal information systems. In Ning Zhong, Zbigniew W. Ras, Shusaku Tsumoto, Einoshin Suzuki, editors, Foundations of Intelligent Systems, 14th International Symposium, ISMIS 2003, Vol. 2871, Lecture Notes in Computer Science, Springer, pages 270 - 278, 2003.
[21] Z. Pawlak. Rough sets. International Journal of Computer and Information Sciences, 11(5):341 - 356, 1982.
[22] Z. Pawlak. Rough Sets — Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht, The Netherlands, 1991.
[23] Zdzislaw Pawlak. Rough set theory and its applications. In Journal of Telecommunications and Information Technology, 2002.
[24] Yuqing Peng, Gengqian Liu, Tao Lin, and Hengshan Geng. Application of rough set theory in network fault diagnosis. In Proceedings of the Third International Conference on Information Technology and Applications (ICITA 05), pages 556 - 559, vol. 2.
[25] C. Perkins and E. Royer. Ad-hoc on demand distance-vector routing for mobile computers. In Proceedings of the Second International Workshop on Mobile Computing Systems and Applications, pages 90 - 100, 1999.
[26] V. Mary Anita Rajam, V. Uma Maheswari, and Arul Siromoney. Mobile ad hoc routing using rough set theory. In 2006 International Conference on Hybrid Information Technology - Vol. 2 (ICHIT'06), pages 80 - 83, November 2006.
[27] V. Mary Anita Rajam, V. Uma Maheswari, and Arul Siromoney. Extensions in mobile ad hoc routing using variable precision rough sets. In IEEE International Conference on Granular Computing, pages 237 - 240, November 2007.
[28] Lin S., Chen S., and Ning Z. Several techniques usable in translating a TIS into IS. 30(5), 2003.
[29] Bjorvand A. T. Mining time series using rough sets - a case study. In Komorowski H. J. and Zytkow J. M., editors, Principles of Data Mining and Knowledge Discovery, Vol. 1263, Lecture Notes in Computer Science, Springer, pages 351 - 358, 1997.
[30] Kowalczyk W. and Slisser F. Analyzing customer retention with rough data models. In Komorowski H. J. and Zytkow J. M., editors, Principles of Data Mining and Knowledge Discovery, Vol. 1263, Lecture Notes in Computer Science, Springer, pages 4 - 13, 1997.
[31] W. Ziarko. Variable precision rough set model. Journal of Computer and Systems Sciences, 46(1):39 - 59, 1993.
[32] Wang Xuren, He Famei, and Xu Rongsheng. Modeling intrusion detection system by discovering association rule in rough set theory framework. In Proceedings of the International Conference on Computational Intelligence for Modelling Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce, page 24, 2006.
[33] J. Zhang and R. D. McLeod. A UDP-based file transfer protocol (UFTP) with flow control using a rough set approach. 2005.
[34] W. Ziarko. Set approximation quality measures in the variable precision rough set model. In Proc. of 2nd Intl. Conference on Hybrid Intelligent Systems, Santiago, Chile, 2002.


INTEGRATION OF FUZZY INFERENCE ENGINE WITH RADIAL
BASIS FUNCTION NEURAL NETWORK FOR SHORT TERM LOAD
FORECASTING

Ajay Shekhar Pandey, S.K. Sinha


Kamla Nehru Institute of Technology, Sultanpur, UP, INDIA
[email protected], [email protected]

D. Singh
Institute of Technology, Banaras Hindu University, Varanasi, UP, INDIA

ABSTRACT

This paper proposes a fuzzy inference based neural network for the forecasting of
short term loads. The forecasting model is the integration of fuzzy inference engine
and the neural network, known as Fuzzy Inference Neural Network (FINN). A
FINN initially creates a rule base from existing historical load data. The parameters
of the rule base are then tuned through a training process, so that the output of the
FINN adequately matches the available historical load data. Results show that the
FINN can forecast future loads with an accuracy comparable to that of neural
networks, while its training is much faster than that of neural networks. Simulation
results indicate that hybrid fuzzy neural network is one of the best candidates for
the analysis and forecasting of electricity demand. Radial Basis Function Neural
Network (RBFNN) integrated with Fuzzy Inference Engine has been used to create
a Short Term Load Forecasting model.

Keywords: STLF, RBFNN, Fuzzy Inference, Fuzzy Inference Neural Networks.

1 INTRODUCTION

Short term forecasts in particular have become increasingly important since the rise of the competitive market. Forecasting the power demand is an important task in power utility companies because accurate load forecasting results in economic, reliable and secure power system operation and planning. Short Term Load Forecasting (STLF) is important for optimum operation planning of power generation facilities, as it affects both system reliability and fuel consumption. The complex dependence of load on human behaviour, social and special events and various environmental factors makes load forecasting a tedious job. It is an important function performed by utilities for planning, operation and control and is primarily used for economic load dispatch, daily operation and control, system security and assurance of reliable power supply. The impacts of globalization and deregulation demand improved quality at competitive prices, which is the reason why development of advanced tools and methods for planning, analysis, operation and control is needed. Important decisions depend on load forecasts with lead times of minutes to months. The ability of ANN to outperform the traditional STLF methods, especially during rapidly changing weather conditions, and the short time required for their development, have made ANN based STLF models a very attractive alternative for on-line implementation in energy control centers. In this era of competitive power market, the main concern is how to improve the accuracy of STLF.

In recent years the use of intelligent techniques has increased noticeably. ANN and fuzzy systems are two powerful tools that can be used in prediction and modeling. Load forecasting techniques such as ANN [4], [5], [6], [7], [11], [15], [18], expert systems [14], fuzzy logic and fuzzy inference [2], [3], [10], [12], [13], [16] have been developed, showing more accurate and acceptable results as compared to conventional methods. A wide variety of conventional models for STLF have also been reported in the literature. They are based on various statistical methods such as regression [1], Box-Jenkins models [9] and exponential smoothing [19]. Conventional ANN model based STLF has several drawbacks, such as long training time and slow convergence speed. The RBF model is a very simple and yet intrinsically powerful network, which is widely used in many fields because of its extensive learning ability and high computing speed [6], [7]. A neuro-fuzzy approach has been applied successfully in a price sensitive environment [2].



Soft Computing (SC), introduced by Lotfi Zadeh [20], is an innovative approach to construct computationally intelligent hybrid systems consisting of Artificial Neural Networks (ANN), Fuzzy Logic (FL), approximate reasoning and optimization methods.

Fuzzy systems are another research area which is receiving increased attention. The pioneering work of Zadeh in fuzzy set theory has inspired work in many research areas with excellent results. A fuzzy expert system for STLF is developed in [15]. It uses fuzzy set theory to model imprecision in the load and temperature models and temperature forecasts as well as operators' heuristic rules. Fuzzy set theory proposed by Zadeh [20] provides a general way to deal with uncertainty, and to express subjective knowledge about a process in the form of linguistic IF-THEN rules.

Fuzzy systems exhibit complementary characteristics, offering a very powerful framework for approximate reasoning as they attempt to model the human reasoning process at a cognitive level. They acquire knowledge from domain experts and this is encoded within the algorithm in terms of a set of IF-THEN rules. Fuzzy systems employ this rule based approach and interpolative reasoning to respond to new inputs. Fuzzy systems are suitable for dealing with problems caused by uncertainty, inexactitude and noise, so uniting fuzzy systems and neural networks can exploit their respective advantages.

In this paper, a fuzzy inference neural network is presented to improve the performance of STLF in electric power systems. A Fuzzy Inference Neural Network initially creates a fuzzy rule base from existing historical load data. The parameters of the rule base are then tuned through a training process so that the output of the network adequately matches the available historical load data. The fuzzy system combines the fuzzy inference principles with the neural network structure and its learning ability into an integrated neural network based fuzzy decision system. Considering that the variation of power system load is non-linear, we set up a new short-term load forecasting model based on fuzzy neural networks and a fuzzy inference algorithm. The flexibility of the fuzzy logic approach, offering a logical set of IF-THEN rules which could be easily understood by an operator, might be a good solution for easy practical implementation and usage of STLF models. The hybrid FNN approach is finally used to forecast loads with greater accuracy than the conventional approaches when used in a stand-alone mode.

2 RADIAL BASIS FUNCTION NEURAL NETWORK

A Radial Basis Function (RBF) Network consists of two layers, a hidden layer with nonlinear neurons and an output layer with linear neurons. Thus, the transformation from the input space to the hidden unit space is non-linear whereas the transformation from the hidden unit space to the output space is linear. The basis functions in the hidden layer produce a localized response to the input, i.e. each hidden unit has a localized receptive field. RBFNNs exhibit good approximation and learning ability, are easier to train and generally converge very fast. The network uses a linear transfer function for the output units and a Gaussian function (radial basis function) for the hidden units. The transfer function of the hidden layer is a non-negative and nonlinear function. In an RBF neural network, three parameters need to be learned: the center and the variance of the basis function, and the weights connecting the hidden layer to the output layer. The RBF network has many training methods according to the different methods of selecting the centers. In this paper, a self-organizing method of selecting the RBF centers is adopted. The method consists of a two-step procedure: the first step is self-organizing learning, which learns the basis function centers and variances; the next step is supervised learning of the weights connecting the hidden layer to the output layer. An RBF neural network thus embodies both the features of an unsupervised learning based classification and a supervised learning layer. The network is mainly a feed forward neural network. The hidden unit consists of a function called the radial basis function, which is similar to the Gaussian density function, whose output is given by

    o_i = exp( − Σ_{j=1}^{r} (x_{jp} − W_{ij})² / σ² )        (1)

where,
W_{ij} = center of the i-th RBF unit for input variable j
σ = spread of the RBF unit
x_{jp} = j-th variable of the input pattern p

The RBF neural network generalizes on the basis of pattern matching. The different patterns are stored in the network in the form of cluster centers of the neurons of the hidden units. The number of neurons determines the number of cluster centers that are stored in the network. The response of a particular hidden layer node is maximum (i.e. 1) when the incoming pattern matches the cluster center of the neuron perfectly, and the response decays monotonically as the input pattern mismatches the cluster center; the rate of decay can be small or large depending on the value of the spread. Neurons with a large spread will generalize more, as they will give the same responses (closer to 1) even for wide variation between the input pattern and the cluster centers, whereas a small spread will reduce the generalization property and work as a memory.
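A small numerical illustration of the effect of the spread, assuming the Gaussian form of Eq. (1) with the squared spread in the denominator (an assumption of this sketch):

    import math

    def rbf_response(x, center, spread):
        # Response of a single hidden unit (cf. Eq. 1) for a one-dimensional input.
        return math.exp(-((x - center) ** 2) / spread ** 2)

    # A large spread keeps the response close to 1 even away from the centre,
    # while a small spread makes the unit respond only to nearby patterns:
    # rbf_response(1.5, 1.0, spread=5.0)  ->  ~0.990
    # rbf_response(1.5, 1.0, spread=0.5)  ->  ~0.368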



Therefore, the spread is an important parameter and depends on the nature of the input pattern space.

The output linear layer simply acts as an optimal combiner of the hidden layer neuron responses. The weights w for this layer are found by the multiple linear regression technique. The output of the linear layer is given by

    y_{mp} = Σ_{i=1}^{N} w_{mi} o_i + b_m        (2)

where,
N = number of hidden layer nodes (RBF units)
y_{mp} = output value of the m-th node in the output layer for the p-th incoming pattern
w_{mi} = weight between the i-th RBF unit and the m-th output node
b_m = biasing strength of the m-th output node
o_i = i-th input to the linear layer.

The values of the different parameters of the RBF networks are determined during training. These parameters are the spread, the cluster centers, and the weights and biases of the linear layer. The number of neurons for the network and the spread are determined through experimentation with a large number of combinations of spread and number of neurons. The best combination is the one which produces the minimum Sum Squared Error (SSE) on the testing data.

3 FUZZY INFERENCE

Fuzzy inference is the process of formulating the mapping from a given input to the output using fuzzy logic. This process numerically evaluates the information embedded in the fuzzy rule base. The fuzzy rule base consists of "IF-THEN" type rules. For a set of input variables, there will be fuzzy membership in several fuzzy input variables. By using the fuzzy inference mechanism, the information is processed to evaluate the actual value from the fuzzy rule base. A good precision can be achieved by applying appropriate membership definitions along with well-defined membership functions. This is an information processing system that draws conclusions based on given conditions or evidence. A fuzzy inference engine is an inference engine using fuzzy variables. Fuzzy inference refers to a fuzzy IF-THEN structure. The fact that fuzzy inference engines evaluate all the rules simultaneously and do not search for matching antecedents on a decision tree makes them perfect candidates for parallel processing computers.

A fuzzy set is a set without a crisp, clearly defined boundary, and can contain fuzzy variables with a partial degree of membership, which is represented by membership functions within the range. There are two types of fuzzy models. The first kind is known as the Mamdani model [8]. In this model, both the fuzzy premise part and the consequence part are represented in linguistic terms. The other kind is the Takagi-Sugeno model [17], which uses linguistic terms only for the fuzzy premise part. In this paper the Takagi-Sugeno reasoning method is used.

The fuzzification interface is a mapping from the observed non-fuzzy input space U ⊆ R^n to the fuzzy sets defined in U. Hence, the fuzzification interface provides a link between the non-fuzzy outside world and the fuzzy system framework. The fuzzy rule base is a set of linguistic rules or conditional statements in the form: "IF a set of conditions is satisfied, THEN a set of consequences is inferred". The fuzzy inference engine is a decision making logic performing the inference operations of the fuzzy rules. Based on the fuzzy IF-THEN rules in the fuzzy rule base and the compositional rule of inference [14], the appropriate fuzzy sets are inferred in the output space.

Suppose the mapping µ_A from the discussed region U to the range [0, 1], U → [0, 1], x → µ_A(x), defines a fuzzy subset of U, named A; the mapping µ_A(x) is known as the membership function of A. The value of the mapping µ_A(x) gives the membership degree of x to the fuzzy set A, called the membership degree for short. In practice, the membership function can be selected according to the characteristics of the object.

Fuzzy inference based on fuzzy estimation is a method by which a new and approximate fuzzy estimation conclusion is inferred using fuzzy language rules. This paper adopts the composite fuzzy inference method, which is an inference method based on the fuzzy relation composition principle. A fuzzy inference engine can process mixed data. Input data received from the external world is analyzed for its validity before it is propagated into a fuzzy inference engine. The capability of processing mixed data is based on the membership function concept, by which all the input data are eventually transformed into the same unit before the inference computations. A fuzzy inference engine normally includes several antecedent fuzzy variables. If the number of antecedent variables is k, then there will be k pieces of information collected from the external world. Fuzzification and normalization are the two typical transformations. Another important property is that when an input data set is partially ambiguous or unacceptable, a fuzzy inference engine may still produce reasonable answers.
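Putting Eq. (1) and Eq. (2) together, a compact sketch of the RBF forward pass with the output weights obtained by multiple linear regression (Python/NumPy; the centres and spread are placeholders here — in the paper they come from the self-organizing step):

    import numpy as np

    def rbf_hidden(X, centers, sigma):
        # Gaussian hidden-layer outputs, one column per RBF unit (cf. Eq. 1).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / sigma ** 2)

    def fit_output_layer(X, y, centers, sigma):
        # Linear output layer (cf. Eq. 2): weights and bias by least squares,
        # i.e. multiple linear regression on the hidden-layer responses.
        O = rbf_hidden(X, centers, sigma)
        A = np.hstack([O, np.ones((len(X), 1))])    # append bias column
        w, *_ = np.linalg.lstsq(A, y, rcond=None)
        return w                                    # last entry is the bias

    def predict(X, centers, sigma, w):
        O = rbf_hidden(X, centers, sigma)
        return O @ w[:-1] + w[-1]

    # X: rows are input patterns (e.g. hour, previous loads, temperatures);
    # y: the corresponding loads; centers could come from self-organizing clustering.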



4 FUZZY INFERENCE NEURAL NETWORK

A fuzzy inference neural network approach, which combines the important features of ANN and fuzzy logic using an inference mechanism, is proposed. This architecture is suggested for realizing cascaded fuzzy inference system and neural network modules, which are used as building blocks for constructing a load forecasting system. The fuzzy membership values of load and temperature are the inputs to the ANN, and the output comprises the membership value of the predicted load. To deal with linguistic values such as high, low and medium, an ANN architecture that can handle fuzzy input vectors is propounded. Each input variable is converted into a fuzzy membership function in the range [0-1] that corresponds to the degree to which the input belongs to a linguistic class. RBFNN has been integrated with fuzzy inference to form a FINN for Short Term Load Forecasting. The RBFNN is used to extract the features of the input and output variables. It is noteworthy that the input variables are extended to include an output variable and extract the relationship between the inputs.

4.1 Input Variable Selection and Data Processing

The most important work in building our Short Term Load Forecasting (STLF) models is the selection of the input variables. It mainly depends on experience and is carried out almost entirely by trial and error. However, some statistical analysis can be very helpful in determining the variables which have significant influence on the system load. Normally, more input neurons make the performance of the neural network worse in many circumstances. Optimal input parameters would result in a compact ANN with higher accuracy and, at the same time, good convergence speed. Parameters with an effect on hourly load can be categorized into day type, historical load data and weather information.

Temperature is the most effective weather information on hourly load. Data has been taken from the TransAlta Canada system. In order to keep the inference case minimal, the input load is sorted into 5 categories labeled as low (L), low medium (LM), medium (M), medium high (MH) and high (H). The input temperature is also sorted into 5 categories in the same way. The design data consist of hourly data, integrated load data and the temperature of two places. Keeping in view the large geographical spread of the area to which the utility supplies power, the hourly temperatures of two places have been taken in the historical data. Firstly, the data are normalized. The n rows thus give, for each group, the values of the m features denoting the characteristics of these groups. In the present work the features correspond to the characterization of the data model, i.e. hour, two-hours-before load, one-hour-before load, temp. 1 and temp. 2. In this paper, fuzzy IF-THEN rules of the form suggested by Takagi-Sugeno [19] are employed, where fuzzy sets are involved only in the premise part of the rules while the consequent part is described by a non-fuzzy function of the input variables. The historical data is used to design data which are further fuzzified using IF-THEN rules.

The data model involves the ranges of the data: low (L), low medium (LM), medium (M), medium high (MH) and high (H), five linguistic variables for each crisp data type. These five linguistic values are defined as L (3800 MW-4200 MW), LM (4280.001 MW-4760 MW), M (4760.001 MW-5240 MW), MH (5240.001 MW-5720 MW) and H (5720.001 MW-6200 MW), and the linguistic values for temperature are L (-370°C to -230°C), LM (-229.999°C to -90°C), M (-89.999°C to +50°C), MH (+50.001°C to +190°C) and H (+190.001°C to +330°C), using the IF-THEN rules.

Figure 1: Forecasting Model (block diagram: Input Variables, Fuzzy Inference Engine, Radial Basis Function Neural Network, Learning Algorithm, Output)
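As an illustration of this categorization, the sketch below maps crisp load values to the linguistic labels quoted above and forms the combined premise label used by the IF-THEN rules that follow (Python; illustrative only — values falling in a gap between the quoted ranges return None):

    def linguistic_label(value, bounds):
        # bounds: list of (label, low, high); returns the label whose range contains value.
        for label, low, high in bounds:
            if low <= value <= high:
                return label
        return None

    LOAD_BOUNDS = [('L', 3800, 4200), ('LM', 4280.001, 4760), ('M', 4760.001, 5240),
                   ('MH', 5240.001, 5720), ('H', 5720.001, 6200)]

    def rule_label(p1_load, p2_load):
        # Combined premise label, e.g. 'L' and 'LM' give 'LLM' (cf. the alpha labels below).
        return linguistic_label(p1_load, LOAD_BOUNDS) + linguistic_label(p2_load, LOAD_BOUNDS)

    # rule_label(4000, 4500)  ->  'LLM'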



These data are normalized and fuzzified using the inference engine as shown in the demand table (Table 1). The five linguistic variables combined through IF-THEN rules, for load as well as temperature, are as follows:
If P1 is low (L) and P2 is low (L) then α=LL
If P1 is low (L) and P2 is low medium (LM) then α=LLM
If P1 is low (L) and P2 is medium (M) then α=LM
If P1 is low (L) and P2 is medium high (MH) then α=LMH
If P1 is low (L) and P2 is high (H) then α=LH
If P1 is low medium (LM) and P2 is low (L) then α=LML
If P1 is low medium (LM) and P2 is low medium (LM) then α=LMLM, and so on.

4.2 Forecasting Model

In FINN the RBFNN plays an important role in classifying input data into clusters while the fuzzy inference engine handles the extraction of rules. Fig. 1 shows the structure of FINN, which has two layers: the input/output layer and the rule layer. The input/output layer has input and output nodes. The input nodes of the input/output layer are connected to neurons on the topological map of the rule layer. The fuzzy membership neural networks are assigned to the weights between the input nodes and the rule layer. Also, the consequent constant is assigned between the output node and the rule layer. The parameter selection method can be considered as a rule base initialization process. Essentially, it performs a fuzzification of the selected input points within the premise space. The mean values of the memberships are centered directly at these points, while the membership deviations reflect the degree of fuzzification and are selected in such a way that a prescribed degree of overlapping exists between successive memberships. The fact that the initial parameters of the FINN are not randomly chosen as in neural networks but are assigned reasonable values with physical meaning gives the training of an FNN a drastic speed advantage over neural networks.

By fusing the strong points of fuzzy logic and neural networks, a fuzzy inference neural network model, which effectively makes use of their advantages, has been developed. The training patterns for the ANN models are obtained from the historical loads by classifying the load patterns according to the day-types of the special days and linearly scaling the load values. The block diagram of the proposed system and the flow chart of the forecasting process are shown in Fig. 1 and Fig. 2.

Figure 2: Flow chart of Forecasting Process (stages: Data Set; Input Data Set (Load1, Load2, Temp1, Temp2, Hr-Load); Making of Rule Base; Categorization and Distribution of Data Set; Data Conversion / Normalization of Data; Crisp Set of Data; Fuzzification (Fuzzified Input Data); Training and Testing through RBFNN; Forecasting; Actual Data; Mean Absolute Percentage Error)

Table 1: Demand table

5 SIMULATION RESULTS

The most widely used index for testing the performance of forecasters is the MAPE. The



Table 2: Forecast errors in MAPE on seasonal transition weeks

Winter Spring Summer Average


January 25-31 May 17-23 July 19-25
Day Day Week Day Week Day Week Day Week
Ahead Ahead Ahead Ahead Ahead Ahead Ahead Ahead
Monday 2.5711 2.5711 1.9990 1.9990 2.2050 2.2050 2.2584 2.2584
Tuesday 1.6763 1.5041 1.8121 1.8797 2.0467 1.9221 1.8450 1.7686
Wednesday 2.0342 2.0527 2.0369 1.9750 2.4277 1.9505 2.1663 1.9927
Thursday 2.4767 2.6438 2.2687 2.0208 1.5584 1.5206 2.1013 2.0617
Friday 2.9492 1.9225 1.8399 1.8356 1.5065 1.5079 2.0985 1.7553
Saturday 2.4953 2.3185 2.4913 2.3826 1.9120 1.9915 2.2995 2.2309
Sunday 2.7416 2.8998 2.6638 2.6110 1.6234 1.5122 2.3429 2.3410
Average 2.4206 2.2732 2.1588 2.1005 1.8971 1.8014 2.1588 2.0584
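For reference, the MAPE figures reported in Tables 2 and 3 follow the usual definition of the mean absolute percentage error, sketched below (Python; the variable names are illustrative):

    def mape(actual, forecast):
        # Mean Absolute Percentage Error over a sequence of hourly loads.
        assert len(actual) == len(forecast) and len(actual) > 0
        return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

    # mape([5000, 5200, 4900], [5100, 5150, 4800])  ->  ~1.67 (percent)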

Table 3: Comparison with MLR and simple RBFNN

Winter Spring Summer


Day January 25-31 May 17-23 July 19-25
MLR RBFNN FINN MLR RBFNN FINN MLR RBFNN FINN
Monday 2.3863 1.0776 2.5711 2.7664 1.0856 1.9990 2.8015 1.2466 2.2050
Tuesday 1.6070 1.0727 1.5041 2.8966 0.7082 1.8797 2.2284 2.2017 1.9221
Wednesday 2.2656 1.1105 2.0527 3.3757 0.9606 1.9750 2.6688 0.8057 1.9505
Thursday 1.8675 0.7494 2.6438 2.3315 2.2876 2.0208 3.0628 1.2365 1.5206
Friday 1.6801 1.1171 1.9225 2.9397 1.1114 1.8356 2.6345 0.9062 1.5079
Saturday 2.8921 1.6459 2.3185 1.0263 0.7726 2.3826 2.4133 1.0312 1.9915
Sunday 2.3560 1.5838 2.8998 2.2336 1.7412 2.6110 2.1984 1.1475 1.5122
Average 2.3228 1.1939 2.2732 2.5100 1.2310 2.1005 2.5725 1.2246 1.8014

Figure 3: Forecast for Winter (January 25-31) [actual and forecasted load (MW) vs. hour of the week]


Figure 4: Forecast for Summer (July 19-25) [actual and forecasted load (MW) vs. hour of the week]



Figure 5: Forecast for Spring (May 17-23) [actual and forecasted load (MW) vs. hour of the week]

designed network is used for the day ahead and week ahead forecasts on an hourly basis. Forecasting has been done on one year of load data of the TransAlta Electric Utility for Alberta, Canada. The load varies from 3900 MW to 6200 MW. The FINN is trained using the last four weeks of hourly load data and is then used to forecast the load for the next 168 hours, i.e. one week. The results are reported for three weeks, one each for the winter, spring and summer seasons. This reflects the behaviour of the network during seasonal changes and the corresponding results are shown in Table 2. It is observed that the performance of the day ahead and week ahead forecasts is equally good. Load shape curves for the three weeks are shown in Fig. 3, Fig. 4 and Fig. 5, and the errors are tabulated in Table 2. It is observed from the figures that the forecaster captures the load shape quite accurately and the forecasting error on most of the week days is low, with slightly higher error on weekend days.

For a comparative study, the proposed FINN method is compared with two other methods, conventional Multi Layer Regression (MLR) and RBF neural networks, for the same period of time. The result (Table 3) shows that the average MAPE for FINN is better than MLR in all seasons and the average MAPE for RBFNN is even better than FINN. But at the same time it is also noticeable that the training time required for forecasting through RBFNN integrated with Fuzzy Inference is approximately ten times less than the training time required for simple RBFNN.

6 CONCLUSION

The benefit of the proposed structure is to utilize the advantages of both, i.e. the generalization capability of ANN and the ability of fuzzy inference to handle uncertain problems and to formalize the experience and knowledge of the forecasters. The load forecasting method proposed above is feasible and effective. A comparative study shows that FINN and RBFNN are more accurate as compared to MLR. The error depends on many factors such as homogeneity in the data, network parameters, choice of model and the type of solution. The flexibility of the fuzzy logic approach, offering a logical set of IF-THEN rules which could be easily understood by an operator, will be a good solution for practical implementation. FINN training time was much faster and it also effectively incorporated linguistic IF-THEN rules. The load forecasting results show that FINN is equally good for week ahead and day ahead forecasting and requires less training time as compared to the other forecasting techniques, conventional regression (MLR) and the simple RBF neural network.

ACKNOWLEDGEMENT

The authors would like to thank TransAlta, Alberta, Canada for providing the load data used in the studies.

7 REFERENCES

[1.] A.D. Papalexopoulos, T. Hasterberg: A Regression based Approach to Short Term System Load Forecast, IEEE Trans. on Power Systems, Vol. 5, No. 4, pp. 1535-1544, (1990).

[2.] A. Khotanzad, E. Zhou and H. Elragal: A Neuro-Fuzzy approach to Short-term load forecasting in a price sensitive environment, IEEE Trans. Power Syst., vol. 17, no. 4, pp. 1273-1282, (2002).

[3.] A. G. Bakirtzis, J. B. Theocharis, S. J. Kiartzis, and K. J. Satsios: Short-term load forecasting using fuzzy neural networks, IEEE Trans. Power Syst., vol. 10, pp. 1518-1524, (1995).



[4.] C.N. Lu, H.T. Wu and S. Vemuri: Neural Network Based Short Term Load Forecasting, IEEE Transactions on Power Systems, Vol. 8, No. 1, pp. 336-342, (1993).
PAS-101, pp. 71-78. (1982)

[5.] D.C. Park, M.A. El-Sharkawi, R.J. Marks, L.E. Atlas and M.J. Damborg: Electric Load Forecasting using an Artificial Neural Network, IEEE Trans. on Power Systems, vol. 6, No. 2, pp. 442-449, (1991).

[6.] D.K. Ranaweera, N.F. Hubele and A.D. Papalexopoulos: Application of Radial Basis Function Neural Network Model for Short Term Load Forecasting, IEE Proc. Gener. Transm. Distrib., vol. 142, No. 1, (1995).

[7.] D. Singh and S.P. Singh: Self selecting neural network for short-term load forecasting, Jour. of Electric Power Components and Systems, vol. 29, pp. 117-130, (2001).

[8.] E. H. Mamdani and S. Assilian: An experiment in linguistic synthesis with a fuzzy logic controller, Int. J. Man-Mach. Stud., vol. 7, no. 1, pp. 1-12, (1975).

[9.] F. Meslier: New advances in short term load forecasting using Box and Jenkins approach, Paper A78 051-5, IEEE/PES Winter Meeting, (1978).

[10.] Hiroyuki Mori and Hidenori Kobayashi: Optimal fuzzy inference for short term load forecasting, IEEE Trans. on Power Systems, vol. 11, No. 2, pp. 390-396, (1996).

[11.] I. Mogram and S. Rahman: Analysis and evaluation of five short term load forecast techniques, IEEE Trans. on Power Systems, Vol. 4, No. 4, pp. 1484-1491, (1989).

[12.] Kwang-Ho Kim, Hyoung-Sun Youn, Yong-Cheol Kang: Short-term Load Forecasting for Special Days in Anomalous Load Conditions Using Neural Network and Fuzzy Inference Method, IEEE Trans. on Power Systems, Vol. 15, pp. 559-569, (2000).

[13.] K.H. Kim, J.K. Park, K.J. Hwang, and S.H. Kim: Implementation of Hybrid Short-term Load Forecasting System Using Artificial Neural Networks and Fuzzy Expert Systems, IEEE Trans. on Power Systems, vol. 10, no. 3, pp. 1534-1539, (1995).

[14.] K.L. Ho, Y.Y. Hsu, C.F. Chen, T.E. Lee, C.C. Liang, T.S. Lai and K.K. Chen: Short Term Load Forecasting of Taiwan Power System using a Knowledge Based Expert System, IEEE Trans. on Power Systems, vol. 5, pp. 1214-1221, (1990).

[15.] K.Y. Lee, Y.T. Cha, and J.H. Park: Short-Term Load Forecasting Using An Artificial Neural Network, IEEE Trans. on Power Systems, vol. 7, no. 1, pp. 124-132, (1992).

[16.] Ranaweera D.K., Hubele N.F. and Karady G.G.: Fuzzy logic for short-term load forecasting, Electrical Power and Energy Systems, Vol. 18, No. 4, pp. 215-222, (1996).

[17.] T. Takagi and M. Sugeno: Fuzzy identification of systems and its applications to modeling and control, IEEE Trans. Syst., Man, Cybern., vol. 15, pp. 116-132, (1985).

[18.] T. S. Dillon, S. Sestito, and S. Leung: Short term load forecasting using an adaptive neural network, Elect. Power Energy Syst., vol. 13, (1991).

[19.] W.R. Christiaanse: Short Term Load Forecasting using general exponential smoothing, IEEE Trans. on Power Appar. Syst., PAS-3, pp. 900-911, (1988).

[20.] Zadeh L.A.: Roles of Soft Computing and Fuzzy Logic in the Conception, Design and Deployment of Information/Intelligent Systems, Computational Intelligence, Soft Computing and Fuzzy-Neuro Integration with Applications, O. Kaynak, L.A. Zadeh, B. Turksen, I.J. Rudas (Eds.), pp. 1-9, (1998).

UbiCC Journal - Volume 3 48


EXPLORING PERFORMANCE LANDSCAPE OF UNSTRUCTURED
SEARCH SCHEMES

Hong Huang and Rajagopal Reddy Manda


Klipsch School of Electrical and Computer Engineering, New Mexico State University, USA
{hhuang, rgopal}@nmsu.edu

ABSTRACT
Search plays an important role in ubiquitous computing. In this paper, we
investigate the expected cost and latency of three unstructured search schemes:
broadcast, TTL-based, and random-walk-based search schemes. We build a unified
analytical model for unstructured search schemes. We demonstrate, through
simulation, that the different search schemes exhibit very different cost and latency
tradeoffs, leaving large gaps in the performance landscape. We propose
randomized mixing schemes to bridge such performance gaps and bring out a new
Pareto frontier that offers more diverse performance choices for applications.

Keywords: performance modeling, randomized mixing, search methods.

1 INTRODUCTION

Search for information in a network has many important applications in ubiquitous computing, such as route discovery in on-demand routing protocols for wireless ad hoc networks [1], event query in sensor nets [2][3], information lookup in peer-to-peer networks [4], etc. Unstructured search refers to a type of search where no a priori knowledge about the search target is available. Unstructured search is applicable in cases where the network is highly dynamic and maintaining an infrastructure for data lookup is too costly [4]. Unstructured search can be, and sometimes is, implemented by a broadcast search. However, a broadcast search is very costly and not scalable. A broadcast search is particularly unjustifiable when the target is replicated for robustness and latency reduction, as is common with today's distributed applications. As mentioned in [4], the fundamental problem of broadcast search lies in the lack of granularity of the search action, i.e., either no action or a very costly one: broadcast.

To reduce cost and provide finer granular search actions than broadcast search, a variety of other unstructured search schemes have been proposed. Here we focus on two types of such schemes appearing in the recent literature: iterative broadcast (TTL-based) schemes, and random-walk-based (RW-based) schemes. In a TTL-based scheme [5][6], a series of broadcasts with increasing scopes is carried out. The scope of a broadcast is determined by the TTL (Time to Live) value carried by a query packet, which limits the number of hops the packet can travel. A TTL-based scheme offers finer granular search actions than a simple broadcast by varying its scope, and promises to reduce cost by trying search actions with smaller cost first, in the hope of finding the target without a high-cost broadcast. However, a TTL-based scheme can be wasteful in the overlapped coverage of successive broadcasts and generally causes larger search latency than simple broadcast.

In a RW-based search, a query packet carries out a random walk in the network, which continues until the target is found [2][3]. A RW-based scheme has the finest granular search action possible, i.e., visiting a single node; but it can cause large latency. Variations on the basic random walk scheme are possible to reduce latency, e.g., using multiple walkers.

There are two main performance metrics for a search scheme [6]: expected cost and expected latency. Expected cost is defined as the expected total number of hops traveled by the query packets generated by a particular search scheme. Expected latency is defined as the expected time duration, in the unit of hops (i.e., one hop takes one unit of time), between initiation of the search and the discovery of the target. We do not include the time for the result to travel from the target to the originator, because it has nothing to do with the merits of a search scheme.

Previous work most closely related to ours includes the following. TTL-based search on a line is first treated in [5], where an optimization problem is formulated and the competitive ratio to the optimal offline algorithm is obtained. In [6], a dynamic programming formulation for TTL-based schemes is developed, and a randomized strategy for selecting TTL values is shown to achieve the minimum worst-case cost if the target distribution is unknown. Such randomized strategies are pursued further in [7] to develop schemes that achieve the lowest cost competitive ratio under a constraint on the delay competitive ratio. An analytical model for TTL-based search using the generating function of the degree
distribution is described in [15]. Random walk is shown to be an effective search method in sensor nets in [2], and its behavior is examined in [3][12]. Unstructured search in peer-to-peer networks is treated in [4], focusing on target replication strategy. Search in graphs with a power-law degree distribution is treated in [14]. Hybrid search schemes combining broadcast and random walk are discussed in [12].

The contributions of this paper are as follows. Although there is much previous work on unstructured search schemes, there is no systematic exploration of the performance landscape of all unstructured search schemes in one place, delineating the feasible regions of performance tradeoffs. This paper hopes to make progress in this direction: We build a unified analytical model for unstructured search schemes, which is parameterized by the granularities of the search sequences. We demonstrate through simulation that different unstructured search schemes exhibit very different cost and latency tradeoffs, leaving large gaps in the performance landscape. We propose randomized mixing schemes to bridge such performance gaps and bring out a new Pareto frontier that offers more diverse performance choices for applications.

The paper is organized as follows. In Section 2, we build a unified model for the three unstructured search schemes. In Sections 3-5, we deal with broadcast, TTL-based, and RW-based search schemes, respectively. We introduce mixed schemes in Section 6, and conclude in Section 7.

2 A UNIFIED PERFORMANCE MODEL FOR UNSTRUCTURED SEARCH SCHEMES

We consider the problem of searching for a target in a network of N nodes. A target is replicated in m copies, and locating any of the replicas makes the search successful. We consider a search scheme consisting of a sequence of actions A = [A1, A2, ..., Al], where A1 is the first search action, A2 the second search action, and so forth, and Al is the terminating action in which the target is found. For an unstructured search scheme, there is no outside clue about the next search action except the history of previous search actions. In a broadcast search, there is a single action: broadcast. In a TTL-based scheme, each action is a limited broadcast with a particular TTL value. In a RW-based scheme, each action is a step in the walk. As we can see, the actions of different search schemes have different granularities, which has performance implications.

We write Ci as the cost of performing search action Ai, Di as the average latency caused by Ai, and Fi as the probability that Ai fails, with the convention F0 = 1, Fl = 0. The cost of the first search action is always paid outright, but that of a later action, Ci, is paid only if the previous action failed, which happens with probability F_{i-1}. The search continues until success (i = l). So the expected cost of a search scheme can be written as

    E[C] = \sum_{i=1}^{l} F_{i-1} C_i    (1)

It is easy to show that the above can be rewritten as

    E[C] = \sum_{i=1}^{l} \left( \sum_{j=1}^{i} C_j \right) (F_{i-1} - F_i)    (2)

The above expression can be recognized as a standard formula for computing an expected value. For each term in the summation, (F_{i-1} - F_i) is the probability that the search sequence does not succeed in the (i-1)th action but succeeds in the ith action, and the summation in the parentheses is the cost of a search sequence terminating at the ith action.

We can use a similar approach to write the average latency of a search scheme, but with a cautionary note. The latency incurred by Ai generally depends on whether the search is successful or not. Consider, for example, a search action with TTL set to 10. If the target is within the TTL scope, say 7 hops away, the latency is 7. But if the target is more than 10 hops away, the search action fails and incurs a latency of 10 regardless of where the target is located. We write D''_i as the latency if the target is within the scope of the ith action, i.e., if the action is successful; and D'_i as the latency if the target is out of the scope, i.e., if the search action is unsuccessful, which is fixed for a particular action. So, with probability (F_{i-1} - F_i), i.e., the search sequence does not succeed in the (i-1)th action but succeeds in the ith action, the search latency is

    D'_1 + D'_2 + \cdots + D'_{i-1} + D''_i

Thus the expected latency can be written as

    E[D] = \sum_{i=1}^{l} \left( \sum_{j=1}^{i-1} D'_j + D''_i \right) (F_{i-1} - F_i)    (3)

Rearranging terms, we have

    E[D] = E[D_{BCast}] + E[D']    (4)

where

    E[D_{BCast}] \equiv \sum_{i=1}^{l} D''_i (F_{i-1} - F_i)    (5)

    E[D'] \equiv \sum_{i=1}^{l} \left( \sum_{j=1}^{i-1} D'_j \right) (F_{i-1} - F_i) = \sum_{i=1}^{l-1} F_i D'_i    (6)
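As a concrete illustration of equations (1) and (3)-(6), the following Python sketch computes the expected cost and latency of a generic search sequence from the per-action quantities C_i, D'_i, D''_i and the failure probabilities F_i. The function names and the example input values are ours, chosen only for illustration; they are not part of the model itself.

```python
def expected_cost(C, F):
    """Eq. (1): E[C] = sum_i F_{i-1} C_i, with F[0] = 1 and F[l] = 0."""
    return sum(F[i - 1] * C[i] for i in range(1, len(C)))

def expected_latency(D_fail, D_succ, F):
    """Eqs. (4)-(6): E[D] = E[D_BCast] + E[D'].

    D_fail[i] = D'_i  (latency of the i-th action when it fails)
    D_succ[i] = D''_i (latency of the i-th action when it succeeds)
    """
    l = len(F) - 1
    d_bcast = sum(D_succ[i] * (F[i - 1] - F[i]) for i in range(1, l + 1))  # eq. (5)
    d_fail = sum(F[i] * D_fail[i] for i in range(1, l))                    # eq. (6)
    return d_bcast + d_fail

# Hypothetical two-action TTL-style sequence: cover 100 nodes, then all 1000.
# Index 0 is a dummy entry so the lists line up with the 1-based model.
F = [1.0, 0.81, 0.0]     # F_0 = 1, F_1 (assumed), F_l = 0
C = [0, 100, 1000]       # C_i = N_i under assumption A2
D_fail = [0, 6, 18]      # D'_i: assumed TTL of each action
D_succ = [0, 4, 11]      # D''_i: assumed hop distance when the target is found

print(expected_cost(C, F))                 # 100 + 0.81 * 1000 = 910.0
print(expected_latency(D_fail, D_succ, F))
```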


In (4)-(6) above, E[D_{BCast}] expresses the expected latency for a search that expands in scope until successful, without incurring the latency of failed actions. This part of the latency is the same as the latency of a broadcast search, which is independent of the particular search scheme used and represents the minimum latency of any search scheme. E[D'] collects the latency incurred due to failed actions, and is dependent on the particular search scheme in question.

The above formulation is general, without regard to any particular search scheme. Before proceeding further, we list our assumptions.

A1 The cost of a search action (a single step) in a random walk is 1.
A2 The cost of a search action that requires broadcasting to n nodes is n, i.e., Ci = Ni in equations (1) and (2).
A3 The m target replicas are independently, identically distributed among the nodes.

The above assumptions are admittedly idealistic, but they are used here to focus on the issues intrinsic to the merits of a particular search scheme and to exclude external factors such as network conditions, implementation efficiency, etc. Assumption A1 holds only if the links are lossless, which is not true in practical situations. However, introducing loss requires specification of a loss probability, which is extraneous detail outside the scope of our discussion. Similarly, A2 holds only if the implementation of broadcast is perfect, again more idealism than realism. Broadcast is known to cause redundancy and inefficiency, to mitigate which methods have been devised [8]. Again, such details fall outside the scope of the present discussion, and a similar approximation is used in [4]. Assumption A3 implies that the probability of success depends only on the number of nodes visited, and has nothing to do with the identities of the visited nodes. In other words, the failure probability Fi is solely a function of the number of nodes visited so far, Ni, i.e.,

    F_i = F(N_i)    (7)

The particular form of F depends on the distribution of the replicas. Now we are ready for the following proposition.

Proposition 2.1 A: The expected cost is parameterized by, and thus depends only on, the search sequence [N1, N2, ..., Nl], regardless of the network topology. B: The expected latency depends on both the search sequence and the network topology.

Proof: The validity of statement A is apparent by combining equations (1) and (7). The validity of statement B derives from the fact that different latencies are incurred in different network topologies to cover the same number of nodes. For example, it incurs just one-hop latency to cover all nodes in the network if the topology is a complete graph, whereas it incurs n/2-hop latency if the topology is a ring of n nodes. Q.E.D.

In the following, we provide bounds on the minimum expected cost and latency, and identify schemes that achieve these bounds. First, we define an idealized random walk.

Definition 2.1 An idealized random walk is one that visits a distinct node in every step.

Proposition 2.2 The minimum expected cost among all unstructured search schemes is

    C_{min} = \sum_{j=1}^{N} F(j-1)    (8)

where F(j-1) is defined in the sense of (7). This minimum expected cost can be achieved by an idealized random walk.

Proof: First, we show that for an arbitrary search scheme, its expected cost is at least C_{min}. Suppose the scheme's search action sequence is [A1, A2, ..., Al], and the corresponding node coverage sequence is [N1, N2, ..., Nl = N]. It is clear that C_i \ge \Delta N_i = N_i - N_{i-1}, since it takes at least a cost of \Delta N_i to cover \Delta N_i nodes. So we have

    E[C] = \sum_{i=1}^{l} F_{i-1} C_i \ge \sum_{i=1}^{l} F_{i-1} \Delta N_i = \sum_{i=1}^{l} F(N_{i-1}) \Delta N_i

The right-hand side of the inequality can be recognized as the discrete integration of Fi with a step size of \Delta N_i on the N-axis, with support intervals N_1 + \Delta N_2 + \cdots + \Delta N_l = [1, 2, \ldots, N]. Since Fi is a non-increasing function of Ni, we have

    \sum_{i=1}^{l} F(N_{i-1}) \Delta N_i \ge \sum_{i=1}^{l} \left( F(N_{i-1}) + F(N_{i-1}+1) + \cdots + F(N_{i-1}+\Delta N_i - 1) \right) = \sum_{j=1}^{N} F(j-1)

Second, an idealized random walk is a sequential visit of each node. Without loss of generality, let the sequence be j = 1, 2, ..., N; it thus incurs the minimum expected cost of

    \sum_{j=1}^{N} F(j-1)    Q.E.D.

The next proposition is obvious, and we state it without proof.

Proposition 2.3 The minimum expected latency is the minimum expected distance in hops between the originator and any one of the target replicas, which can be achieved by a broadcast search.
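For intuition, the bound in (8) is easy to evaluate once a form of F is fixed. The short Python sketch below assumes the uniform-replica failure probability F(n) = (1 - n/N)^m derived in the next section (equation (9)); the function names and parameter values are illustrative assumptions only.

```python
def failure_prob(n, N, m):
    """Assumed form of F(n): m replicas uniformly distributed among N nodes (cf. eq. (9))."""
    return (1.0 - n / N) ** m

def min_expected_cost(N, m):
    """Eq. (8): C_min = sum_{j=1}^{N} F(j-1), achieved by an idealized random walk."""
    return sum(failure_prob(j - 1, N, m) for j in range(1, N + 1))

for m in (1, 2, 5, 10):
    print(m, round(min_expected_cost(1000, m), 1))
# With N = 1000 the bound is roughly N/(m+1): about 500, 334, 167 and 91 visited nodes.
```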


In the next three sections, we derive cost and latency models for three classes of unstructured search schemes: broadcast, TTL-based, and RW-based. To compute the expected cost and latency, one needs a specific probability distribution for the replicas in the network. In the following, we assume replicas are distributed uniformly in the network, which is a common case, e.g., in DHT-based data lookup schemes in peer-to-peer networks [4]. Thus, given m replicas, the failure probability of a search action that covers Ni of the N nodes in the network is

    F_i = \left( 1 - \frac{N_i}{N} \right)^m    (9)

3 BROADCAST SEARCH

A broadcast search consists of a single action, i.e., broadcast. Its cost is fixed, with E[C_{BCast}] = N; however, the expected latency is not so trivial. To make the analysis tractable, we assume that the network lives in a d-dimensional space, with the originator residing at the center of the space. According to A3, the probability that a target resides in a particular region is proportional to the volume of the region. The probability that a target (at random distance x) is less than r hops away can be computed as

    P_r \equiv P(x \le r) = \left( \frac{r}{R} \right)^d

where R is the radius of the network in hops. The probability that the minimum distance between the originator and the m target replicas is no larger than r hops is

    P_r^m \equiv P(x_m \le r) = 1 - \left[ 1 - \left( \frac{r}{R} \right)^d \right]^m

And according to Proposition 2.3, the expected latency of a broadcast search can be calculated as the expected minimum distance between the originator and the targets:

    E[D_{BCast}] = \int_0^R r \, dP_r^m = -\int_0^R r \, d\left[ 1 - \left( \frac{r}{R} \right)^d \right]^m = m R \, \frac{\Gamma(m)\,\Gamma(\frac{1}{d}+1)}{\Gamma(m+\frac{1}{d}+1)}    (10)

where \Gamma( ) is the gamma function. For d = 2, which is our main focus here, we have N = \alpha \pi R^2, with \alpha being the node density. The node density, defined as the number of nodes per square hop, depends on the nodal degree and the network topology in question. Expressions for \alpha can be obtained only in some special cases. For example, it is straightforward to show that \alpha = 1 for a square grid with nodal degree four, \alpha = 2/\sqrt{3} for a hexagonal grid with nodal degree six, etc. Since the density only introduces a scalar factor, we will, without loss of generality, use the value of one in the following. Thus, in a 2-D space, we have R = (N/\pi)^{1/2}, and

    E[D_{BCast}] = \frac{m \sqrt{N} \, \Gamma(m)}{2 \, \Gamma(m+\frac{3}{2})}    (11)

In passing, we note that for large m [9],

    \Gamma(m) \cong \sqrt{2\pi} \, m^{m-\frac{1}{2}} e^{-m} \left( 1 + O\!\left( \frac{1}{m} \right) \right)

We have, for arbitrary dimension and large m,

    E[D_{BCast}] \cong R \, m^{-(\frac{1}{2}+\frac{1}{d})} \, \Gamma\!\left( \frac{1}{d}+1 \right)    (12)

4 TTL-BASED SEARCH SCHEMES

A TTL-based search consists of a sequence of broadcasts with increasing TTL values, with the last action being a network-wide broadcast. To motivate our analysis, we consider the first two actions in a TTL-based search, i.e., A1 and A2, with increasing TTL values, covering N1 and N2 nodes and incurring costs of C1 and C2, respectively. Note that the node coverage of A2 includes that of A1. We consider two options: a) a sequence of search actions [A1, A2], or b) a single action A2. Clearly, these two options have the same probability of success at the conclusion of the search. However, their costs are different, i.e., Ca = C1 + F1 C2 for option a, and Cb = C2 for option b. Option a is preferable to b only if C1 + F1 C2 < C2. Rearranging terms, we have the following lemma, which was similarly stated in [6] but is derived here using a simpler argument.

Lemma 4.1 The cost of a search action A2 can be reduced by using a sequence of search actions [A1, A2] if the following inequality holds:

    F_1 < 1 - \frac{C_1}{C_2}    (13)

Specializing to our model, and under the assumptions A1-A3, we have C1 = N1, C2 = N2, and the inequality becomes


    F_1 = \left( 1 - \frac{N_1}{N} \right)^m < 1 - \frac{N_1}{N_2}

It holds only if m > 1, since N is the total number of nodes in the network and cannot be less than N2. So we have:

Corollary 4.1 If m = 1, the cost of a search action A2 is no more than that of a sequence of search actions [A1, A2].

We generalize Corollary 4.1 in the following proposition.

Proposition 4.1 Under assumptions A1 and A3, and if the number of replicas is one, the cost of a broadcast search is no more than that of a TTL-based search.

Proof: We use Corollary 4.1 recursively. Consider a TTL-based search scheme consisting of a search sequence [A1, A2, ..., Al]. Apply Corollary 4.1 to the subsequence [A1, A2], which can be replaced with a single action A2, resulting in a new search sequence [A2, A3, ..., Al], which has a cost no more than that of the original sequence. Applying Corollary 4.1 again to the new sequence, and so forth, we can eventually reduce the original sequence to a single action Al, i.e., a broadcast, with a cost no more than that of the original sequence. Q.E.D.

The above proposition implies that a TTL-based search is competitive with a broadcast search only if m > 1.

We now proceed to calculate the expected cost and latency of a TTL-based scheme. In our model, a TTL-based search scheme consists of a sequence of actions A1, A2, ..., Al, with Ci = Ni and Cl = N, where Ni is the number of nodes covered in the ith round. Using (1), we calculate the expected cost as

    E[C_{TTL}] = \sum_{i=1}^{l} F_{i-1} C_i = \sum_{i=1}^{l} N_i \left( 1 - \frac{N_{i-1}}{N} \right)^m = N \sum_{i=1}^{l} p_i f_{i-1}^m    (14)

where p_i \equiv N_i / N and f_i \equiv 1 - p_i.

Clearly, the expected cost depends on the vector {fi} or its complement {pi}, where pi indicates the proportion of nodes covered in the ith search action and fi is its complement. We choose {pi} or {fi} to minimize the expected cost. Obviously, fi and pi, being probabilities, are constrained to the interval [0, 1]. Thus, we have a constrained optimization problem. According to the Kuhn-Tucker conditions for constrained optimization, the fi's that minimize the expected cost fall into two cases: a) located at the constraint boundary, i.e., fi = 0 or 1, or b) located inside the interval and subject to the condition below

    \frac{\partial E[C_{TTL}]}{\partial f_i} = -f_{i-1}^m + m \, p_{i+1} f_i^{m-1} = 0    (15)

Equation (15) provides a recursive relationship that determines f_{i+2} given f_{i+1} and f_i (note that p_i = 1 - f_i). The optimal solution is given by testing the values of the expected cost at fi's given by the recursive equation and at the boundary fi = 0 (since fi <= 1, and the fi's are strictly decreasing after each search action). An exact solution of the above set of nonlinear equations is difficult. However, we can try different f1's (note that f0 = 1, since F0 = 1) and determine a family of sequences. The family of sequences provides different cost and latency tradeoffs and includes one that has the minimum cost.

We calculate the expected latency using (4) and (6):

    E[D_{TTL}] = E[D_{BCast}] + \sum_{i=1}^{l-1} F_i D_i

    \sum_{i=1}^{l-1} F_i D_i = \sum_{i=1}^{l-1} h_i \left( 1 - \frac{N_i}{N} \right)^m = \beta \sum_{i=1}^{l-1} p_i^{1/d} f_i^m

where h_i is the TTL value (hop count) for the ith action, which can be expressed as

    h_i = \beta \, p_i^{1/d}

with \beta being a constant determined by the network topology. For 2-D space, which is our interest here, we have, noting that the node density is 1, \pi h_i^2 = p_i N, and thus \beta = \sqrt{N/\pi}, and therefore

    E[D_{TTL}] = E[D_{BCast}] + \frac{1}{\sqrt{\pi}} \sum_{i=1}^{l-1} \sqrt{N_i} \, f_i^m    (16)

Simulation results are shown in Figure 1 for a network with N = 1000 and m = 2, 4, ..., 10. The network topology is generated randomly with an average degree of five. The expected cost and latency as a function of p1 (the proportion of nodes covered in the first action) are plotted in the figure. From the figure, one can see that the expected cost is small in the region of small p1, decreasing in a very gradual way to a minimum, and then rising quickly to become large at large p1; see Figure 1(a). The implication is that the expected cost is insensitive to the selection of p1 as long as it is small. This turns out to be quite useful in practical implementation because there is no need to painstakingly search for


the optimal p1; any choice would be fine as long as it is a small number.

Figure 1: Expected cost E[C] (a) and latency E[D] (b) versus p1 for TTL-based schemes.

The expected latency shows the opposite trend: it increases quickly to a maximum in the small-p1 regime, then decreases monotonically and eventually saturates at large p1; see Figure 1(b). In particular, the maximum latency occurs roughly around the point where the minimum cost occurs, indicating that reduced expected cost comes at the expense of increased latency.

5 RW-BASED SEARCH SCHEMES

In a RW-based search, a query packet visits nodes sequentially and terminates when the target is found. There are a variety of ways to carry out a RW-based search. One way is to employ a single random walker (SRW). Such a search, if it is ideal, incurs minimal cost according to Proposition 2.2, but it also causes large latency. To lower latency, multiple random walkers (MRW) can be released simultaneously, but with a large increase in cost. A fundamental problem for MRW is that there is no communication between walkers once they are released. So even if one walker finds the target, the other walkers would continue until they individually find the target, incurring large cost. One way to rectify this problem is to make random walkers terminate probabilistically at each additional step. However, this leaves open the possibility that all the walkers might terminate before the target is found. Our solution is to use one persistent walker that never terminates, assisted by k non-persistent walkers, which survive with probability q at each step. The persistent walker guarantees that the search is always successful, while the non-persistent walkers speculatively wander away trying to reduce latency. In the following, we discuss the expected cost and latency of SRW and MRW in turn.

We model a SRW search scheme as a sequence of actions A1, A2, ..., Al, with Ci = 1, Di = 1, and l >= N. Let si indicate the number of distinct nodes visited in the first i steps; then the probability of failing to find any of the m replicas of the target is given by (since the search for each replica has to fail)

    F_i = \left( 1 - \frac{s_i}{N} \right)^m = f_i^m    (17)

where

    f_i = 1 - \frac{s_i}{N}    (18)

We calculate the expected cost using (1):

    E[C_{SRW}] = \sum_{i=1}^{l} F_{i-1} C_i = \sum_{i=0}^{l-1} f_i^m    (19)

We compute the expected latency as

    E[D_{SRW}] = \sum_{i=1}^{l} F_{i-1} D_i = \sum_{i=0}^{l-1} f_i^m    (20)

The expected latency can be computed the same way as the cost since each step incurs a fixed latency (one unit), which is not true for other search schemes. A distinguishing fact for SRW is that the expected cost is the same as the expected latency. For other search schemes, the expected cost is generally larger than the expected latency. This is because, for an action covering Ni nodes, the cost is paid outright as Ni, but the latency can be smaller than Ni if the target is found before every node is covered. Such an inequality does not apply to SRW because each step is elementary, incurring unit cost and latency.

There remain to be determined the values of si. There is no exact expression for si, though asymptotic expressions do exist [10]. One such expression says


    s_i \to (1 - P_R)\, i \quad \text{as } i \to \infty

where P_R is the probability that the walker will return to the originator. But such an asymptotic expression is not useful here, for the use of a RW-based search makes little sense if i gets as large as N, let alone infinity, because one might as well use broadcast with the same cost but much less latency. We think that, in a practical implementation, one will not use a pure random walk simply for the sake of its theoretical purity, because it is patently inefficient, i.e., it visits previously visited nodes repeatedly. Instead, some optimization will be used in a practical implementation, such as using a flag to indicate previous visits. Therefore, in a practical implementation, the behavior of a random walker approaches that of an idealized walker, i.e., s_i \to i, to the extent that such optimization is done.

We now turn to MRW. In MRW, in addition to one persistent random walker that terminates only if the target is found, k independent, identical, non-persistent random walkers are also released, each having a probability of 1-q to terminate at each step. Since the random walkers are independent, we calculate the expected cost of MRW as

    E[C_{MRW}] = k \left[ \sum_{i=0}^{l-1} P_i f_i^m \right] + \sum_{i=0}^{l-1} f_i^m    (21)

where F_i is given by (17), and P_i is the probability that a non-persistent walker is still alive after the ith step, given by

    P_i = \frac{q^i - q^{l+1}}{1 - q^{l+1}}    (22)

The expected latency is determined by the minimum among the multiple walkers, whose population at the ith step is

    K_i = 1 + k P_i    (23)

The search continues only if all Ki walkers fail to find any of the m replicas, so the expected latency is

    E[D_{MRW}] = \sum_{i=0}^{l-1} f_i^{m K_i}    (24)

By varying k and q, we can construct a family of RW-based schemes. Simulation results are shown in Figure 2 for a network with N = 1000, random topology, average degree five, and one persistent walker plus one to five non-persistent walkers. To approximate a practical, optimized implementation, previously visited nodes are flagged and are avoided unless all neighbors are flagged, in which case the walker moves on to a flagged node regardless. The expected cost and latency as a function of q, the survival probability, are plotted in Figure 2, which shows the case m = 2. Plots for other m values show similar features and are thus omitted.

Figure 2: Expected cost E[C] (a) and latency E[D] (b) versus q for RW-based schemes.

From the figure, one can see that large performance variation occurs only where q is close to 1. Roughly, once q is smaller than 0.9, the expected cost and latency quickly converge to those achieved by a single persistent walker. Figure 2 also shows the tradeoff between expected cost and latency, i.e., latency can be reduced, at the expense of increased cost, by either increasing the number of non-persistent walkers (k) or increasing the survival probability (q). An important point, which will be elaborated in the next section, is that the cost-latency tradeoff here exhibits a very different feature from that of TTL-based schemes; compare Figures 1 and 2.
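To make the tradeoff concrete, the sketch below evaluates the TTL cost/latency model of (14) and (16) for a given coverage sequence, and the MRW model of (21)-(24) for a given (k, q), under the uniform-replica failure probability of (9) and the idealized-walker approximation s_i = i. The parameter values and function names here are illustrative assumptions, not the simulation setup used for the figures.

```python
import math

def ttl_cost_latency(N, m, coverage):
    """Eqs. (14) and (16) for a TTL scheme with coverage sequence N_1 < ... < N_l = N."""
    d_bcast = m * math.sqrt(N) * math.gamma(m) / (2.0 * math.gamma(m + 1.5))  # eq. (11)
    cost, latency, F_prev = 0.0, d_bcast, 1.0
    for idx, Ni in enumerate(coverage):
        cost += F_prev * Ni                      # F_{i-1} * C_i, with C_i = N_i
        F_prev = (1.0 - Ni / N) ** m             # eq. (9): failure prob. after this action
        if idx < len(coverage) - 1:              # failed-action latency term of eq. (16)
            latency += F_prev * math.sqrt(Ni / math.pi)
    return cost, latency

def mrw_cost_latency(N, m, k, q):
    """Eqs. (21)-(24) with the idealized-walker approximation s_i = i (so f_i = 1 - i/N)."""
    l = N
    cost = latency = 0.0
    for i in range(l):
        f = 1.0 - i / N
        P = (q ** i - q ** (l + 1)) / (1.0 - q ** (l + 1))   # eq. (22)
        cost += (1.0 + k * P) * f ** m                        # persistent + surviving walkers
        latency += f ** (m * (1.0 + k * P))                   # eq. (24)
    return cost, latency

print(ttl_cost_latency(N=1000, m=5, coverage=[50, 200, 1000]))
print(mrw_cost_latency(N=1000, m=5, k=3, q=0.95))
```

Sweeping p1 (through the coverage sequence) on one hand and (k, q) on the other reproduces the qualitative behavior discussed above: the TTL scheme trades a little latency for a wide range of costs, while the MRW scheme trades a little cost for a wide range of latencies.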


6 MIXED SEARCH SCHEMES

In this section, we examine more closely the performance landscape of the different unstructured search schemes, and introduce mixed schemes to provide more diverse cost and latency tradeoffs. In the following, we simplify notation by using CA, DA to indicate E[CA], E[DA] of a particular search scheme A, respectively. We start by laying down some preparations.

Definition 6.1 The feasible region of a search scheme in the performance space consists of the union of the expected cost and latency pairs, each realizable using an instance of the search scheme.

Definition 6.2 A search scheme A dominates a search scheme B if C_A \le C_B and D_A \le D_B.

Definition 6.3 A search scheme is Pareto-optimal [11] if it is dominated by no other search scheme; the set of all search schemes that are Pareto-optimal forms the Pareto frontier.

Definition 6.4 A mixed scheme between two search schemes A and B is a randomized scheme that selects A with a certain probability p and B with probability 1-p.

Proposition 6.1 The feasible region of a mixed scheme is convex.

Proof: Consider two schemes A and B in the feasible region, with expected cost and latency of CA, CB, DA, DB, respectively. For any p in [0, 1], a mixed scheme with expected cost and latency of pCA + (1-p)CB, pDA + (1-p)DB is realizable. Therefore the feasible region is convex. Q.E.D.

Equipped with the above background, let us examine the cost-latency tradeoffs of the three types of unstructured search schemes discussed previously. Simulation results for a network of 1000 nodes, with random topology and average degree five, are shown in Figure 3. Broadcast consists of a single point in the performance region, incurring minimum latency but largest cost. TTL-based schemes consist of a family of schemes parameterized by p1, the proportion of nodes covered by the first action. TTL-based schemes achieve a slightly larger latency than broadcast but can incur significantly lower cost, especially for large m (number of replicas) and optimized p1; see Figure 3(c). RW-based schemes consist of a family of schemes parameterized by k (= 1-5), the initial number of non-persistent walkers, and q, the survival probability. In the figure, individual curves correspond to a particular value of k, with larger-k curves lying higher. RW-based schemes can achieve the lowest cost, especially for small m, see Figure 3(a); but they incur the largest latency.

A remarkable fact about Figure 3 is that the different search schemes exhibit very different cost-latency tradeoffs, leaving conspicuous gaps between them.

Figure 3: Expected cost E[C] versus latency E[D] for broadcast (BCast), TTL-based, and RW-based search schemes, shown with m=2 (a), m=5 (b) and m=10 (c). Dotted lines indicate Pareto frontiers of the mixed scheme.

In the following, we focus on TTL-based and RW-based schemes, omitting broadcast because of its deficiency. We call a TTL-based scheme latency-preferred since its expected latency is low and insensitive to the parameter (p1), which is largely


limited by the E[D_{BCast}] term in (16). However, its expected cost can vary widely with the selection of the parameter.

On the other hand, we call a RW-based scheme cost-preferred since it exhibits the opposite behavior: its expected cost is low and relatively insensitive to the choice of the parameters (k, q), but its expected latency varies widely with the selection of the parameters. Further explanation, based on the analytical model, of the difference in cost-latency characteristics appears in the appendix.

TTL-based and RW-based schemes are modeled after broadcast and the idealized random walk, which incur minimum latency and cost, respectively. Therefore, the fact that TTL-based schemes are latency-preferred and RW-based schemes are cost-preferred is not surprising in retrospect. Due to the peculiar performance characteristic of each type of scheme, any particular type may not satisfy the requirement of an application. Applications with varying cost-latency requirements can be best served if a more diverse Pareto frontier is accessible. This can be accomplished by mixing TTL-based and RW-based schemes. The dotted lines in Figure 3 indicate the Pareto frontiers of the mixed scheme, and are constructed based on the following proposition.

Proposition 6.2 The Pareto frontier of mixing two families of search schemes A and B consists of parts of the Pareto frontiers of A and B, namely LA and LB, and the Pareto frontier of the linear combinations of points on LA and LB.

Proof: We prove that any point P in the feasible region of the mixed scheme is dominated by the Pareto frontier proposed in the proposition. P can come from only three sources: the feasible region of a pure A scheme, that of a pure B scheme, or linear combinations (mixing) of schemes A and B. In case P is from a pure A region, P is dominated by some point on LA. In case P is from a pure B region, P is dominated by some point on LB. In case P is from a linear combination of points P1 of an A scheme and P2 of a B scheme, P is dominated by a linear combination of points PA and PB, where PA and PB are on LA and LB and dominate P1 and P2, respectively; refer to Figure 4. Q.E.D.

We provide a simple example to demonstrate the value of a mixed scheme. Suppose an application has the requirements E[C] < C0 and E[D] < D0. Such a requirement is satisfied only by a mixed scheme and is not satisfied by either of the pure schemes; refer to Figure 4.

A cautionary note before we conclude: a mixed scheme achieves the cost-latency point only in the expected sense, and the variance can be large. However, oftentimes application quality-of-service requirements are expressed in terms of expected performance metrics, and mixed schemes are useful to achieve performance tradeoffs otherwise inaccessible to pure schemes.

7 CONCLUSION

Unstructured search has high potential for large-scale, highly dynamic networks, and it provides a rich field for research endeavors. Broadcast, TTL-based, and RW-based schemes represent some typical instances of unstructured search. However, there exist large gaps in the performance landscape between these pure schemes. Mixed schemes can help to bridge the performance gap. This is important because different applications require different performance tradeoffs, which a mixed scheme can provide where a pure scheme cannot.

Figure 4: Pareto frontier of a mixed scheme can satisfy an application requirement (shaded region) otherwise not satisfied by a pure scheme.

8 APPENDIX

We provide a more detailed explanation, based on the analytical models developed earlier, of the cost-latency tradeoffs of TTL-based and RW-based schemes, which can be quantified by the cost-latency slope, defined in the obvious way as dC/dD.

For the TTL-based scheme, using (14) and (16), we have

    \frac{dC_{TTL}}{dD_{TTL}} = \frac{dC_{TTL}/dp_1}{dD_{TTL}/dp_1} = \frac{-\left( N + C'_2 + \cdots \right)}{m \sqrt{\frac{N}{\pi p_1}} \, f_1^m \left( \frac{p_1}{f_1} - \frac{1}{2m} \right) + D'_2 + \cdots}

where in the last expression we separate out the leading term and the rest of the terms in dC_{TTL}/dp_1 and dD_{TTL}/dp_1, respectively. Exact evaluation of dC_{TTL}/dD_{TTL} is difficult, but we can gain some insight by examining just the ratio of the leading terms, since the individual terms decay rapidly as O(f_i^m), and with significant jumps in TTL value (and thus f_i) for successive search actions. The ratio of the leading


terms (LT) is

    \left( \frac{dC_{TTL}}{dD_{TTL}} \right)_{LT} = O\!\left( \frac{\sqrt{N}}{m f_1^m} \right)

which explains the large cost-latency slope of TTL-based schemes in Figure 3 (because N is large).

We proceed in a similar manner with the RW-based schemes, the difference being that now we have two parameters (k, q), and that f_i decreases slowly without jumps (the decrement being at most 1/N). Thus a leading-term approximation does not provide much insight here. We write the full expressions below, using (21) and (24):

    \left. \frac{\partial C_{RW}}{\partial D_{RW}} \right|_{q} = \frac{\partial C_{RW}/\partial k}{\partial D_{RW}/\partial k} = \frac{\sum_{i=0}^{l-1} P_i f_i^m}{m \sum_{i=0}^{l-1} P_i f_i^{m K_i} \ln f_i}

    \left. \frac{\partial C_{RW}}{\partial D_{RW}} \right|_{k} = \frac{\partial C_{RW}/\partial q}{\partial D_{RW}/\partial q} = \frac{\sum_{i=0}^{l-1} \frac{dP_i}{dq} f_i^m}{m \sum_{i=0}^{l-1} \frac{dP_i}{dq} f_i^{m K_i} \ln f_i}

From the above, we can explain the smaller cost-latency slope by the absence of the \sqrt{N} term present in the TTL-based schemes. Further, the m factor in the denominator explains why the slope becomes smaller with larger m, as shown in Figure 3.

9 REFERENCES

[1] E. M. Royer and C.-K. Toh: A Review of Current Routing Protocols for Ad Hoc Mobile Wireless Networks, IEEE Personal Communications, Vol. 6, No. 2, pp. 46-55 (1999).
[2] D. Braginsky and D. Estrin: Rumor Routing Algorithm for Sensor Networks, in Proc. International Conference on Distributed Computing Systems (2002).
[3] S. Shakkottai: Asymptotics of Query Strategies over a Sensor Network, in Proc. IEEE INFOCOM (2004).
[4] E. Cohen and S. Shenker: Replication Strategies in Unstructured Peer-to-Peer Networks, in Proc. ACM SIGCOMM (2002).
[5] Y. Baryshinikov, E. Coffman, P. Jelenkovic, P. Momcilovic, and D. Rubenstein: Flood Search under the California Split Rule, Operations Research Letters, Vol. 32, No. 3, pp. 199-206 (2004).
[6] N. Chang and M. Liu: Revisiting the TTL-based Controlled Flooding Search: Optimality and Randomization, in Proc. ACM MobiCom (2004).
[7] N. Chang and M. Liu: Controlled Flooding Search with Delay Constraints, in Proc. IEEE INFOCOM (2006).
[8] S. Y. Ni, Y. C. Tseng, Y. S. Chen, and J. P. Sheu: The Broadcast Storm Problem in a Mobile Ad Hoc Network, in Proc. ACM MobiCom (1999).
[9] D. Zwillinger: CRC Standard Mathematical Tables and Formulae, Chapman & Hall/CRC (2003).
[10] B. H. Hughes: Random Walks and Random Environments, Volume 1: Random Walks, Clarendon Press, Oxford (1995).
[11] M. J. Osborne and A. Rubinstein: A Course in Game Theory, MIT Press (1994).
[12] C. Gkantsidis, M. Mihail, and A. Saberi: Random walks in peer-to-peer networks, in Proc. IEEE INFOCOM (2004).
[13] C. Gkantsidis, M. Mihail, and A. Saberi: Hybrid search schemes for unstructured peer-to-peer networks, in Proc. IEEE INFOCOM (2005).
[14] L. A. Adamic, R. M. Lukose, B. Huberman, and A. R. Puniyani: Search in power-law networks, Physical Review E, Vol. 64, 046135 (2001).
[15] R. Gaeta, G. Balbo, S. Bruell, M. Gribaudo, and M. Sereno: A simple analytical framework to analyze search strategies in large-scale peer-to-peer networks, Performance Evaluation, Vol. 62, Issues 1-4 (2005).


Impact of Query Correlation on Web Searching
Ash Mohammad Abbas
Department of Computer Engineering
Zakir Husain College of Engineering and Technology
Aligarh Muslim University, Aligarh - 202002, India.

Abstract— Correlation among queries is an important factor to analyze, as it may affect the results delivered by a search engine. In this paper, we analyze correlation among queries and how it affects the information retrieved from the Web. We analyze two types of queries: (i) queries with embedded semantics, and (ii) queries without any semantics. In our analysis, we consider parameters such as search latencies and search relevance. We focus on two major search portals that are mainly used by end users. Further, we discuss a unified criterion for comparing the performance of the search engines.

Index Terms— Query correlation, search portals, Web information retrieval, unified criteria for comparison, earned points.

I. INTRODUCTION

The Internet, which was aimed at communicating research activities among a few universities in the United States, has now become a basic need of life for all people who can read and write throughout the world. This has become possible only due to the proliferation of the World Wide Web (WWW), which is now simply called the Web. The Web has become the largest source of information in all parts of life. Users from different domains often extract information that fits their needs. The term Web information retrieval^1 is used for extracting information from the Web.

Although Web information retrieval finds its roots in traditional database systems [4], [5], the retrieval of information from the Web is more complex as compared to information retrieval from a traditional database. This is due to subtle differences in their respective underlying databases^2. In a traditional database, the data is often organized, limited, and static. As opposed to that, the Webbase is unorganized, unlimited, and often dynamic. Every second, a large number of updates are carried out in the Webbase. Moreover, as opposed to a traditional database, which is controlled by a specific operating system and whose data is located either at a central location or at least at a few known locations, the Webbase is not controlled by any specific operating system and its data may not reside either at a central site or at a few known locations. Further, the Webbase can be thought of as a collection of a large number of traditional databases of various organizations. The expectations of a user searching information on the Web are much higher than those of a user who is simply retrieving some information from a traditional database. This makes the task of extracting information from the Web a bit challenging [1].

Since Web searching is an important activity and the results obtained may affect decisions and directions for individuals as well as organizations, it is of utmost importance to analyze the parameters or constituents involved in it. Many researchers have analyzed many different issues pertaining to Web searching, including index quality [2], user-effort measures [3], Web page reputation [6], and user-perceived quality [7].

In this paper, we try to answer the following question: What happens when a user fires queries to a search engine one by one that are correlated? Specifically, we wish to evaluate the effect of correlation among the queries submitted to a search engine (or a search portal).

The rest of this paper is organized as follows. In Section II, we briefly review the methodologies used in popular search engines. In Section III, we describe query correlation. Section IV contains results and discussion. In Section V, we describe a criterion for comparison of search engines. Finally, Section VI is for conclusion and future work.

II. A REVIEW OF METHODOLOGIES USED IN SEARCH ENGINES

First we discuss a general strategy employed for retrieving information from the Web, and then we review some of the search portals.

A. A General Strategy for Searching

A general strategy for searching information on the Web is shown in Fig. 1. Broadly, a search engine consists of the following components: User Interface, Query Dispatcher, Cache^3, Server Farm, and Web Base. The way these components interact with one another depends upon the strategy employed in a particular search engine. We describe here a broad view. An end user fires a query using an interface, say the User Interface. The User Interface provides a form to the user. The user fills the form with a set of keywords to be searched. The query goes to the Query Dispatcher. If the query, after some

^1 The terms Web surfing, Web searching, Web information retrieval, and Web mining are often used in the same context. However, they differ depending upon the methodologies involved, the intensity of seeking information, and the intentions of the users who extract information from the Web.
^2 Let us use the term Webbase for the collection of data in the case of the Web, in order to differentiate it from the traditional database.
^3 We use the word Cache to mean the Search Engine Cache, i.e., storage space where results matching previously fired queries or words are kept for future use.
Fig. 1. A general strategy for information retrieval from the Web.

refinement^4, matches a query in the Cache, the results are immediately sent by the Query Dispatcher to the User Interface and hence to the user. Otherwise, the Query Dispatcher sends the query to one of the servers in the Server Farm, which are busy building a Web Base for the search engine. The server so contacted, after due consideration of the Web Base, sends the results to the Cache so that the Cache may store them for future reference, if any. The Cache sends them to the Query Dispatcher. Finally, through the User Interface, the response is returned to the end user.

In what follows, we briefly review the strategies employed by different search portals.

B. Review of Strategies of Search Portals

The major search portals or search engines^5 which end users generally use for searching are Google™ and Yahoo™. Let us briefly review the methodologies behind the respective search engines^6 of these search portals.

Google is based on the PageRank scheme described in [8]. It is somewhat similar to the scheme proposed by Kleinberg in [9], which is based on hub and authority weights and focuses on the citations of a given page. To understand Google's strategy, one has to first understand the HITS (Hyperlink-Induced Topic Search) algorithm proposed by Kleinberg. The readers are directed to [9] for HITS and to [8] for PageRank.

On the other hand, Yahoo employs an ontology-based search engine. An ontology is a formal term used to mean a hierarchical structure of terms (or keywords) that are related. The relationships among the keywords are governed by a set of rules. As a result, an ontology-based search engine such as Yahoo may search other related terms that are part of the ontology of the given term. Further, an ontology-based search engine may not search words that are not part of its ontology. It can modify its ontology with time. One step more, an ontology-based search engine may also shorten the set of searched results before presenting it to the end users, removing results that are not part of the ontology of the given term.

We now describe an important aspect pertaining to information retrieval from the Web. The results delivered by a search engine may depend on how the queries are formulated and what relation a given query has with previously fired queries, if any. We wish to study the effect of correlation among the queries submitted to a search engine.

III. QUERY CORRELATION

The searched results may differ depending upon whether a search engine treats a set of words as an ordered set or an unordered set. In what follows, we consider each one of them.

A. Permutations

Searched results delivered by a search engine may depend upon the order of the words appearing in a given query^7. If we take into account the order of words, the same set of words may form different queries for different orderings. The different orderings of the set of words of the given query are called permutations. The formal definition of the permutations of a given query is as follows.

Definition 1: Let the query Q = {w_i | 1 ≤ i ≤ m}, Q ≠ φ, be a set of words excluding stop words of a natural language. Let P = {x_j | 1 ≤ j ≤ m} be a set of words excluding stop words. If P is such that w_i = x_j for some j not necessarily equal to i, and for every w_i ∈ Q there exists an x_j ∈ P such that w_i = x_j where j may not be equal to i, then P is called a permutation of Q.

In the above definition, stop words are language dependent. For example, in the English language, the set of stop words, S, is often taken as

    S = {a, an, the, is, am, are, will, shall, of, in, for}

^4 By refinement of a query, we mean that the given query is transformed in such a way that the words and forms that are not so important are eliminated, so that they do not affect the results.
^5 A search engine is a part of a search portal. A search portal provides many other facilities or services such as Advanced Search, News, etc.
^6 The respective products are trademarks of their organizations.
^7 The term 'query' means a set of words that is given to a search engine to search for the information available on the Web.
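As an illustration of Definition 1, the short Python sketch below removes stop words from a query and enumerates its permutations. The stop-word list shown is only a small illustrative subset, and the helper name is ours.

```python
from itertools import permutations

# Illustrative stop-word list (a small subset; the full list is language dependent).
STOP_WORDS = {"a", "an", "the", "is", "am", "are", "will", "shall", "of", "in", "for"}

def query_permutations(query):
    """Return all orderings (Definition 1) of the query's non-stop words."""
    words = [w for w in query.lower().split() if w not in STOP_WORDS]
    return [" ".join(p) for p in permutations(words)]

# A 3-word query yields 3! = 6 permutations.
for q in query_permutations("Ash Mohammad Abbas"):
    print(q)
```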
Note that if there are m words (excluding the stop words) in the given query, the number of permutations is m!.

The permutations are concerned with a single query. By submitting different permutations of the given query to a search engine, one may evaluate how the search engine behaves for different orderings of the same set of words. However, one would also like to know how the given search engine behaves when an end user fires different queries that may or may not be related. Specifically, one would be interested in the behavior of a given search engine when the queries are related. In what follows, we discuss what is meant by the correlation among different queries.

B. Correlation

An important aspect that may affect the results of Web searching is how different queries are related. Two queries are said to be correlated if there are common words between them. A formal definition of correlation among queries is as follows.

Definition 2: Let Q1 and Q2 be queries given to a search engine such that Q1 and Q2 are sets of words of a natural language and Q1, Q2 ≠ φ. Q1 and Q2 are said to be correlated if and only if there exists a set C = Q1 ∩ Q2, C ≠ φ.

One may use the above definition to define k-correlation between any two queries. Formally, it can be stated as a corollary of Definition 2.

Corollary 1: Two queries are said to be k-correlated if and only if |C| = k, where |·| denotes the cardinality.

For two queries that are correlated, we define a parameter called the Correlation Factor^8 as follows:

    Correlation Factor = \frac{|Q_1 \cap Q_2|}{|Q_1 \cup Q_2|}    (1)

This is based on the fact that |Q_1 \cup Q_2| = |Q_1| + |Q_2| - |Q_1 \cap Q_2|.

Note that 0 ≤ Correlation Factor ≤ 1. For two uncorrelated queries the Correlation Factor is 0. Further, one can see from Definition 1 that for the permutations of the same query, the Correlation Factor is 1.

Similarly, one may define the Correlation Factor for a cluster of queries. Let the number of queries be O. The cardinality of the union of the given cluster of queries is given by the following equation:

    \left| \bigcup_{o=1}^{O} Q_o \right| = \sum_{i} |Q_i| - \sum_{i<j} |Q_i \cap Q_j| + \sum_{i<j<k} |Q_i \cap Q_j \cap Q_k| - \cdots + (-1)^{O-1} |Q_1 \cap Q_2 \cap \cdots \cap Q_O|    (2)

Using (2), one may define the Correlation Factor of a cluster of queries as follows:

    Correlation Factor = \frac{\left| \bigcap_{o=1}^{O} Q_o \right|}{\left| \bigcup_{o=1}^{O} Q_o \right|}    (3)

A high correlation factor means that the queries in the cluster are highly correlated, and vice versa.

In what follows, we discuss results pertaining to query correlation.

^8 This correlation factor is nothing but Jaccard's Coefficient, which is often used as a measure of similarity.

Fig. 2. Latency versus page number for permutation P1.
Fig. 3. Latency versus page number for permutation P2.
Fig. 4. Latency versus page number for permutation P3.
Fig. 5. Latency versus page number for permutation P4.
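Referring back to equations (1) and (3), the Correlation Factor is simply the Jaccard coefficient over the word sets of the queries. A minimal Python sketch is given below, applied to an illustrative pair of queries; the function names are ours.

```python
def words(query, stop_words=frozenset()):
    """Treat a query as an (unordered) set of non-stop words."""
    return {w for w in query.lower().split() if w not in stop_words}

def correlation_factor(*queries, stop_words=frozenset()):
    """Eqs. (1) and (3): |intersection| / |union| over two or more queries."""
    sets = [words(q, stop_words) for q in queries]
    union = set.union(*sets)
    inter = set.intersection(*sets)
    return len(inter) / len(union) if union else 0.0

q1 = "node disjoint multipath routing"
q2 = "edge disjoint multipath routing"
print(correlation_factor(q1, q2))                    # 3 common words of 5 distinct -> 0.6
print(correlation_factor(q1, "ash mohammad abbas"))  # uncorrelated queries -> 0.0
```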
TABLE I
Search latencies, query space, and the number of relevant results for different permutations of the query "Ash Mohammad Abbas" for Google. Columns p1-p10 are result pages 1-10; for each permutation, the three rows give, in order, the search latency, the query space, and the number of relevant results.

Permutation  p1      p2      p3      p4      p5      p6      p7      p8      p9      p10
1  latency   0.22    0.15    0.04    0.08    0.33    0.29    0.15    0.13    0.16    0.17
   space     300000  300000  300000  300000  300000  300000  300000  300000  300000  300000
   relevant  8       5       1       0       2       0       0       3       0       0
2  latency   0.51    0.15    0.22    0.19    0.13    0.12    0.10    0.27    0.16    0.15
   space     300000  300000  300000  300000  300000  300000  300000  300000  300000  300000
   relevant  3       2       2       1       1       0       2       0       0       0
3  latency   0.30    0.08    0.18    0.20    0.14    0.25    0.13    0.21    0.14    0.21
   space     300000  300000  300000  300000  300000  300000  300000  300000  300000  300000
   relevant  6       4       1       3       2       1       1       0       0       0
4  latency   0.60    0.07    0.35    0.11    0.13    0.15    0.23    0.13    0.28    0.26
   space     300000  300000  300000  300000  300000  300000  300000  300000  300000  300000
   relevant  3       0       2       1       0       0       2       0       1       1
5  latency   0.38    0.09    0.39    0.14    0.17    0.15    0.14    0.16    0.15    0.13
   space     300000  300000  300000  300000  300000  300000  300000  300000  300000  300000
   relevant  3       2       1       2       1       0       0       1       1       1
6  latency   0.36    0.15    0.10    0.12    0.18    0.17    0.15    0.13    0.20    0.15
   space     300000  300000  300000  300000  300000  300000  300000  300000  300000  300000
   relevant  5       4       1       3       0       2       1       2       2       0

TABLE II
Search latencies, query space, and the number of relevant results for different permutations of the query "Ash Mohammad Abbas" for Yahoo. Columns p1-p10 are result pages 1-10; for each permutation, the three rows give, in order, the search latency, the query space, and the number of relevant results.

Permutation  p1      p2      p3      p4      p5      p6      p7      p8      p9      p10
1  latency   0.15    0.15    0.27    0.24    0.25    0.23    0.34    0.21    0.27    0.30
   space     26100   26400   26400   27000   27000   26900   26900   26900   25900   25900
   relevant  10      4       1       0       0       1       0       0       0       0
2  latency   0.18    0.13    0.20    0.15    0.19    0.10    0.15    0.09    0.12    0.13
   space     26900   27000   27000   26900   25800   26900   26900   26800   26800   26800
   relevant  4       6       1       1       0       1       1       1       0       0
3  latency   0.12    0.11    0.15    0.14    0.11    0.10    0.11    0.12    0.09    0.13
   space     26900   27100   26900   26900   26500   26800   26800   26500   26500   26700
   relevant  10      3       1       2       0       0       0       0       0       0
4  latency   0.03    0.10    0.14    0.13    0.12    0.20    0.10    0.19    0.12    0.17
   space     27000   26400   26400   26700   27000   26700   26400   26900   26800   26800
   relevant  7       4       0       2       1       0       0       1       0       1
5  latency   0.12    0.12    0.20    0.08    0.13    0.10    0.12    0.09    0.13    0.20
   space     26400   26800   26800   26800   26800   26700   26700   26800   26700   26200
   relevant  8       5       1       1       0       0       0       0       0       1
6  latency   0.16    0.10    0.16    0.12    0.13    0.11    0.10    0.11    0.12    0.15
   space     27100   26700   27100   26700   26600   27000   26600   26900   26500   26500
   relevant  10      5       0       0       0       1       0       0       0       0

Fig. 6. Latency versus page number for permutation P5.
Fig. 7. Latency versus page number for permutation P6.

IV. RESULTS AND DISCUSSION

The search portals that we have evaluated are Google and Yahoo. We have chosen them because they represent the search portals that the majority of end users in today's world use in their day-to-day searching. One more reason behind choosing them for performance evaluation is that they represent different classes of search engines. As mentioned earlier, Yahoo is based on ontology while Google is based on page ranks. Therefore, if one selects them, one may evaluate two distinct classes of search engines.

The search environment is as follows. The client from which queries were fired was a Pentium III machine. The machine
0.65
Google:Q1 was part of a 512Kbps local area network. The operating
Google:Q2
0.6
Yahoo:Q1
Yahoo:Q2 system was Windows XP.
0.55 In what follows, we discuss behavior of search engines for
0.5
different permutations of a query.
0.45
Latency

0.4
A. Query Permutations
0.35
To see how a search engine behaves for different permuta-
0.3
tions of a query, we consider the following query.
0.25

Ash Mohammad Abbas


0.2
1 1.5 2 2.5 3 3.5 4
Correlation
The different permutations of this query are
Fig. 8. Latency versus correlation for queries with embedded semantics.
1 Ash Mohammad Abbas
2 Ash Abbas Mohammad
3 Abbas Ash Mohammad
4 Abbas Mohammad Ash
5 Mohammad Ash Abbas
6 Mohammad Abbas Ash

Fig. 9. Latency versus correlation for random queries.

We have assigned a number to each permutation to differentiate from one another. We wish to analyze search results on the basis of search time, number of relevant results and query space. The query space is nothing but the cardinality of all results returned by a given search engine in response to a given query. Note that search time is defined as the actual time taken by the search engine to deliver the results searched. Ideally, it
does not depend upon the speeds of hardware, software, and
network components from where queries are fired because it is
the time taken by the search engine server. Relevant results are
those which the user intends to search. For example, the user intends to search information about Ash Mohammad Abbas⁹. Therefore, all those results that contain Ash Mohammad Abbas are relevant for the given query.
In what follows, we discuss the results obtained for different permutations of a given query. Let the given query be Ash Mohammad Abbas. For all permutations, all those results that contain Ash Mohammad Abbas are counted as relevant results. Since both Google and Yahoo deliver the results page wise, therefore, we list all parameters mentioned in the previous paragraph page wise. We go up to 10 pages for both the search engines as the results beyond that are rarely significant.

Fig. 10. Query Space versus correlation for queries with embedded semantics.

Table I shows search latencies, query space, and the number
of relevant results for different permutations of the given query.
The search portal is Google. Our observations are as follows.
- For all permutations, the query space remains the same and it does not vary along the pages of results.
- The time to search the first page of the results in response to the given query is the largest for all permutations.
- The first page of results contains the most relevant results.

Fig. 11. Query Space versus correlation for random queries.

⁹ We have intentionally taken the query: Ash Mohammad Abbas. We wish to search for different permutations of a query and the effect of those permutations on query space and on the number of relevant results. The relevance is partly related to the intentions of an end-user. Since we already know what are the relevant results for the chosen query, it is easier to decide what relevant results out of them have been returned by a search engine. The reader may take any other query, if he/she wishes so. In that case, he/she has to decide what are the results that are relevant to his/her query and this will partly depend upon what he/she intended to search.
TABLE III
QUERIES WITH EMBEDDED SEMANTICS.
S. No. Query No. Query Correlation
E1 Q1 node disjoint multipath 1
Q2 edge disjoint multicast
E2 Q1 node disjoint multipath routing 2
Q2 edge disjoint multicast routing
E3 Q1 node disjoint multipath routing 3
Q2 edge disjoint multipath routing
E4 Q1 node disjoint multipath routing ad hoc 4
Q2 wireless node disjoint multipath routing

TABLE IV
QUERIES WITHOUT EMBEDDED SEMANTICS (RANDOM QUERIES).
S. No. Query No. Query Correlation
R1 Q1 adhoc node ergonomics 1
Q2 quadratic power node
R2 Q1 computer node constellations parity 2
Q2 hiring parity node biased
R3 Q1 wireless node parity common mitigate 3
Q2 mitigate node shallow rough parity
R4 Q1 few node parity mitigate common correlation 4
Q2 shallow mitigate node parity common stanza

TABLE V
SEARCH TIME AND QUERY SPACE FOR QUERIES WITH EMBEDDED SEMANTICS.
S. No. Query No. Google Yahoo
Time Query Space Time Query Space
E1 Q1 0.27 43100 0.37 925
Q2 0.23 63800 0.28 1920
E2 Q1 0.48 37700 0.40 794
Q2 0.32 53600 0.32 1660
E3 Q1 0.48 37700 0.40 794
Q2 0.24 21100 0.34 245
E4 Q1 0.31 23500 0.64 79
Q2 0.33 25600 0.44 518

TABLE VI
SEARCH TIME AND QUERY SPACE FOR RANDOM QUERIES.
S. No. Query No. Google Yahoo
Time Query Space Time Query Space
R1 Q1 0.44 28500 0.57 25
Q2 0.46 476000 0.28 58200
R2 Q1 0.46 34300 0.55 164
Q2 0.42 25000 0.35 90
R3 Q1 0.47 25000 0.40 233
Q2 0.33 754 0.68 31
R4 Q1 0.34 20000 0.58 71
Q2 1.02 374 0.64 23

Table II shows the same set of parameters for different permutations of the given query for search portal Yahoo. From the table, we observe that
- As opposed to Google, the query space does not remain the same; rather, it varies with the pages of searched results. The query space in this case is less than that of Google.
- The time to search the first page of results is not necessarily the largest of the pages considered. More precisely, it is larger for the pages where there is no relevant result. Further, the time taken by Yahoo is less than that of Google.
- In most of the cases, the first page contains the largest number of relevant results. For permutation 2 (i.e. Ash Abbas Mohammad), the second page contains the largest number of relevant results.
Let us discuss reasons for the above mentioned observations. Consider the question why the query space in case of Google is larger than that of Yahoo. We have pointed out that Google is based on page ranks. For a given query (or a set of words), it ranks the pages. It delivers all the ranked pages that contain the words contained in the given query. On the other hand, Yahoo is an ontology based search engine. As mentioned earlier, it will search only that part of its Webbase that constitutes the ontology of the given query. This is the reason why the query space in case of Google is larger than that of Yahoo.
TABLE VII
LATENCY MATRIX, L, FOR DIFFERENT PERMUTATIONS.
P   p1   p2   p3   p4   p5   p6   p7   p8   p9   p10
1   1/2  1    1    0    0    1    1    1    1    1
2   0    0    0    0    1    0    1    0    0    0
3   0    1    0    0    0    0    0    0    0    0
4   0    1    0    1    0    1    0    1    0    0
5   0    1    0    0    0    0    0    0    0    1
6   0    0    1/2  1/2  0    0    0    0    0    1

TABLE VIII
QUERY SPACE MATRIX, S, FOR DIFFERENT PERMUTATIONS.
P   p1   p2   p3   p4   p5   p6   p7   p8   p9   p10
1   1    1    1    1    1    1    1    1    1    1
2   1    1    1    1    1    1    1    1    1    1
3   1    1    1    1    1    1    1    1    1    1
4   1    1    1    1    1    1    1    1    1    1
5   1    1    1    1    1    1    1    1    1    1
6   1    1    1    1    1    1    1    1    1    1

TABLE IX
RELEVANCE MATRIX FOR DIFFERENT PERMUTATIONS FOR GOOGLE.
P   p1   p2   p3   p4   p5   p6   p7   p8   p9   p10
1   8    5    1    0    2    0    0    3    0    0
2   3    2    2    1    1    0    2    0    0    0
3   6    4    1    3    2    1    1    0    0    0
4   3    0    2    1    0    0    2    0    1    1
5   3    2    1    2    1    0    0    1    1    1
6   5    4    1    3    0    2    1    2    2    0

TABLE X
RELEVANCE MATRIX FOR DIFFERENT PERMUTATIONS FOR YAHOO.
P   p1   p2   p3   p4   p5   p6   p7   p8   p9   p10
1   10   4    1    0    0    1    0    0    0    0
2   4    6    1    1    0    1    1    1    0    0
3   10   3    1    2    0    0    0    0    0    0
4   7    4    0    2    1    0    0    1    0    1
5   8    5    1    1    0    0    0    0    0    1
6   10   5    0    0    0    1    0    0    0    0

Let us answer the question why the query space changes in case of Yahoo and why it remains constant in case of Google. Note that ontology may change with time and with the order of words in the given query. For every page of results, Yahoo estimates the ontology of the given permutation of the query before delivering the results to the end user. Therefore, the query space for different permutations of the given query is different and it changes with pages of the searched results¹⁰. However, page ranks do not change either with pages or with the order of words. The page ranks will only change when new links or documents are added to the Web that are relevant to the given query. Since neither a new link nor a new document is added to the Web during the evaluation of permutations of the query, the query space does not change in case of Google.
In order to compare the performance of Google and Yahoo, the latencies versus page numbers for different permutations of the query have been shown in Figures 2 through 7. Let us consider the question why the search time in case of Google is larger than that of Yahoo. Note that Google ranks the results before delivering them to end users while Yahoo does not. The ranking of pages takes time. This is the reason why the search time taken by Google is larger than that of Yahoo.
In what follows, we discuss how a search engine behaves for correlated queries.

B. Query Correlation

We have formulated k-correlated queries as shown in Table III. Since all words contained in a query are related¹¹, we call them queries with embedded semantics. On the other hand, we have another set of k-correlated queries as shown in Table IV. The words contained in these queries are random and are not related semantically.
We wish to evaluate the performance of a search engine for k-correlated queries. For that we evaluate the search time and query space of a search engine for the first page of results. Since both Google and Yahoo deliver 10 results per page, looking for the first page of results means that we are evaluating the 10 top most results of these search engines. Note that we do not consider the number of relevant results because relevancy in this case would be query dependent. Since there is no single query, evaluation of relevancy would not be so useful.
Table V shows search time and query space for k-correlated queries with embedded semantics (see Table III). The second query, Q2, is fired after the first query Q1. On the other hand, Table VI shows search time and query space for k-correlated queries whose words may not be related (see Table IV).

TABLE XI
RELEVANCE FOR DIFFERENT PERMUTATIONS.
P      Google  Yahoo
1      19      16
2      11      15
3      18      16
4      10      16
5      12      16
6      20      16
Total  90      95

TABLE XII
Earned Points (EP) FOR DIFFERENT PERMUTATIONS.
       Google                          Yahoo
P      Latency  Query Space  EP        Latency  Query Space  EP
1      14.5     19           33.5      3        0            3
2      3        11           14        14       0            14
3      4        18           22        13       0            13
4      1        10           11        9        0            9
5      3        12           15        10       0            10
6      2.5      20           22.5      16       0            16
Total                        118                             65

¹⁰ This observed behavior may also be due to the use of a randomized algorithm. To understand the behavior of randomized algorithms, readers are referred to any text on randomized algorithms such as [10].
¹¹ More precisely, all words in these queries are from ad hoc wireless networks, an area in which the authors of this paper like to work.
TABLE XIII
CEP FOR DIFFERENT PERMUTATIONS FOR GOOGLE.
P      Latency Contribution  Query Space Contribution
1      123.834               5700000
2      61.262                3300000
3      116.534               5400000
4      35.918                3000000
5      73.458                3600000
6      530.376               6000000
Total  530.376               27000000

TABLE XIV
CEP FOR DIFFERENT PERMUTATIONS FOR YAHOO.
P      Latency Contribution  Query Space Contribution
1      101.385               419900
2      107.821               404100
3      131.558               431000
4      308.197               428700
5      130.833               425000
6      121.591               431500
Total  901.385               2540200

In order to compare the performance of Yahoo and Google, the latencies versus correlation for queries with embedded semantics are shown in Figure 8 and those for randomized queries are shown in Figure 9. Similarly, the query space for queries with embedded semantics is shown in Figure 10 and that for randomized queries is shown in Figure 11.
The query space of Yahoo is much less than that of Google for the reasons discussed in the previous subsection. Other important observations are as follows.
- In case of k-correlated queries with embedded semantics, generally the time to search for Q2 is less than that of Q1. This is due to the fact that, since the queries are correlated, some of the words of Q2 have already been searched while searching for Q1.
- The query space is increased when the given query has a word that is more frequently found in Web pages (e.g. in R1:Q2, the word quadratic that is frequently used in Engineering, Science, Maths, Arts, etc.). The query space is decreased when there is a word included in the query which is rarely used (e.g. mitigate included in R3,R4:Q1,Q2 and shallow included in R3,R4:Q2).
- The search time is larger in case of randomized queries as compared to queries with embedded semantics. The reason for this observation is as follows. In case of queries with embedded semantics, the words of a given query are related and are found in Web pages that are not too far from one another, either from the point of view of page rank as in Google or from the point of view of ontology as in Yahoo.
- One cannot infer anything about the search time of Google and Yahoo as it depends upon the query. More precisely, it depends upon which strategy takes more time, the page rank in Google or the estimation of ontology in Yahoo.
However, from Table V and Table VI, one can infer the following. Google is better in the sense that its query space is much larger than that of Yahoo. However, Yahoo takes less time as compared to Google for different permutations of the same query. For k-correlated queries with embedded semantics, Google takes less time to search for the first query as compared to Yahoo. It also applies to randomized queries with some exceptions. In exceptional cases, Google takes much more time as compared to Yahoo. We have mentioned previously that it depends upon the given query as well as the strategy employed in the search engine.
In what follows, we describe a unified criteria for comparing the search engines considered in this paper.

V. A UNIFIED CRITERIA FOR COMPARISON

Let us denote Google by a superscript '1' and Yahoo by a superscript '0'¹². Let L = [l_{ij}] be a matrix where l_{ij} is defined as follows.

l_{ij} = \begin{cases} 1 & \text{if } latency^{1}_{ij} < latency^{0}_{ij} \\ \tfrac{1}{2} & \text{if } latency^{1}_{ij} = latency^{0}_{ij} \\ 0 & \text{otherwise} \end{cases}    (4)

Similarly, let S = [s_{ij}] be a matrix where s_{ij} is defined as follows.

s_{ij} = \begin{cases} 1 & \text{if } space^{1}_{ij} > space^{0}_{ij} \\ \tfrac{1}{2} & \text{if } space^{1}_{ij} = space^{0}_{ij} \\ 0 & \text{otherwise} \end{cases}    (5)

In the matrices defined above, a '1' means that at that place Google is the winner, and a '1/2' represents that there has been a tie between Google and Yahoo. We now define a parameter that we call Earned Points (EP), which is as follows.

EP^{k} = \sum_{i=1}^{pages} relevant^{k}_{i} \left( L^{k}_{i} + S^{k}_{i} \right)    (6)

where the superscript k ∈ {0, 1} denotes the search engine.
Table VII shows a latency matrix, L, for different permutations of the query as that for Table I and Table II, and has been constructed using both of them. In the latency matrix, there are 40 '0's, 17 '1's, and 3 '1/2's. We observe from the latency matrix that Yahoo is the winner (as far as latencies are concerned), as there are 40 '0's out of 60 entries in total.
On the other hand, Table VIII shows the query space matrix, S, for different permutations of the same query, constructed using the tables mentioned in the preceding paragraph. One can see that as far as query space is concerned, Google is the sole winner. In fact, the query space of Google is much larger than that of Yahoo.
The relevance matrix for Google is shown in Table IX and that for Yahoo is shown in Table X. The total relevance for the first ten pages is shown in Table XI for both Google as well as Yahoo. It is seen from Table XI that the total relevance for Google is 90 and that for Yahoo is 95.

¹² This is simply a representation. One may consider a representation which is the reverse of it; then also, there will not be any effect on the criteria.
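To make the earned-points bookkeeping concrete, the following Python sketch shows one way formulas (4)-(6) could be evaluated from the per-page latency, query-space and relevance measurements of Tables I and II; it is an illustrative reconstruction under those assumptions, not the authors' implementation.

```python
# Illustrative sketch of formulas (4)-(6): Earned Points for two search engines.
# Engine 1 = Google, engine 0 = Yahoo, following the paper's convention.

def indicator(better, a, b):
    """1 if engine 1 wins, 1/2 on a tie, 0 otherwise (formulas (4) and (5))."""
    if better(a, b):
        return 1.0
    if a == b:
        return 0.5
    return 0.0

def earned_points(latency1, latency0, space1, space0, relevant):
    """EP for one permutation over its pages (formula (6)).

    latency1/latency0, space1/space0: per-page measurements for engines 1 and 0.
    relevant: per-page relevant-result counts keyed by engine (1 or 0).
    Returns (EP_engine1, EP_engine0).
    """
    ep = {1: 0.0, 0: 0.0}
    for i in range(len(latency1)):
        # Lower latency wins; larger query space wins.
        l1 = indicator(lambda a, b: a < b, latency1[i], latency0[i])
        s1 = indicator(lambda a, b: a > b, space1[i], space0[i])
        l0, s0 = 1.0 - l1, 1.0 - s1          # complementary entries for engine 0
        ep[1] += relevant[1][i] * (l1 + s1)
        ep[0] += relevant[0][i] * (l0 + s0)
    return ep[1], ep[0]

# Example with permutation 2 of the query (rows taken from Tables I and II):
google_lat = [0.51, 0.15, 0.22, 0.19, 0.13, 0.12, 0.10, 0.27, 0.16, 0.15]
yahoo_lat  = [0.18, 0.13, 0.20, 0.15, 0.19, 0.10, 0.15, 0.09, 0.12, 0.13]
google_sp  = [300000] * 10
yahoo_sp   = [26900, 27000, 27000, 26900, 25800, 26900, 26900, 26800, 26800, 26800]
rel = {1: [3, 2, 2, 1, 1, 0, 2, 0, 0, 0], 0: [4, 6, 1, 1, 0, 1, 1, 1, 0, 0]}
print(earned_points(google_lat, yahoo_lat, google_sp, yahoo_sp, rel))
```

Run on permutation 2, this sketch reproduces the EP values of 14 for both engines reported in Table XII, which is a useful sanity check on the comparison operators used in (4) and (5).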
The average relevance per permutation and per page for Google is 1.5 and that for Yahoo is 1.583. Therefore, as far as average relevance is concerned, Yahoo is the winner.
Table XII shows the number of earned points for both Google as well as Yahoo for different permutations of the query mentioned earlier. We observe that the number of earned points for Google is 118 and that for Yahoo is 65. The number of earned points of Google is far greater than that of Yahoo. The reason behind this is that the query space of Yahoo is always less than that of Google and it does not contribute to the number of earned points.
A closer look at the definition of EP reveals that, while defining the parameter EP in (6) together with (4) and (5), we have assumed that a search engine either has a constituent parameter (latency or query space) or it does not have that parameter at all. The contribution of one of the parameters is lost due to the fact that the effective contribution of the other parameter, by which the given parameter is multiplied, is zero. Note that our goal behind the introduction of (6) was to rank the given set of search engines. We call this type of ranking of search engines Lossy Constituent Ranking (LCR). We, therefore, feel that there should be a method of comparison between a given set of search engines that is lossless in nature. For that purpose, we define another parameter that we call Contributed Earned Points (CEP). The definition of CEP is as follows.

CEP^{k} = \sum_{i=1}^{pages} relevant^{k}_{i} \left( \frac{1}{d^{k}_{i}} + q^{k}_{i} \right)    (7)

where the superscript k ∈ {0, 1} denotes the search engine, d denotes the actual latency, and q denotes the actual query space. The reason behind having an inverse of the actual latency in (7) is that the better search engine would be that which takes less time.

TABLE XV
CONTRIBUTION DUE TO QUERY SPACE IN CEP FOR DIFFERENT SETS OF WEIGHTS.
Weights                  Google         Yahoo
wl = 1, wq = 10^-6       27.00          2.54
wl = 1, wq = 10^-5       270.00         25.40
wl = 1, wq = 10^-4       2700.00        254.02
wl = 1, wq = 10^-3       27000.00       2540.20
wl = 1, wq = 10^-2       270000.00      25402.00
wl = 1, wq = 10^-1       2700000.00     254020.00
wl = 1, wq = 1           27000000       2540200

TABLE XVI
CCEP FOR DIFFERENT SETS OF COMPARABLE WEIGHTS.
Weights                  Google         Yahoo
wl = 0.9, wq = 0.1       486.3384       811.2465
wl = 0.8, wq = 0.2       442.3008       721.1080
wl = 0.7, wq = 0.3       398.2632       630.9695
wl = 0.6, wq = 0.4       354.2256       540.8310
wl = 0.5, wq = 0.5       310.1880       450.6925
wl = 0.4, wq = 0.6       266.1504       360.5540
wl = 0.3, wq = 0.7       222.1128       270.4155
wl = 0.2, wq = 0.8       178.0752       180.2770
wl = 0.1, wq = 0.9       134.0376       90.1385

Table XIII shows the contributions of latency and query space in CEP for Google. Similarly, Table XIV shows the same for Yahoo. We observe that the contribution of latency for Google is 530.376 and that for Yahoo is 901.385. However, the contribution of query space for Google is 27000000 and that for Yahoo is 2540200. In other words, the contribution of query space for Google is approximately 11 times that for Yahoo. Adding these contributions shall result in a larger CEP for Google as compared to Yahoo. The CEP defined using (7) has a problem that we call the dominating constituent problem (DCP). The larger parameter suppresses the smaller parameter. Note that the definition of CEP in (7) assumes equal weights for latency and query space. On the other hand, one may be interested in assigning different weights to the constituents of CEP depending upon the importance of the constituents. Let us rewrite (7) to incorporate weights. Let w_l and w_q be the weights assigned to latency and query space, respectively. Then (7) can be written as follows.

CEP^{k} = \sum_{i=1}^{pages} relevant^{k}_{i} \left( \frac{1}{d^{k}_{i}} w_{l} + q^{k}_{i} w_{q} \right)    (8)

The weights should be chosen carefully. For example, the weights w_l = 1, w_q = 10^-6 will add 27 to the contribution in CEP due to query space for Google and 2.54 for Yahoo. On the other hand, a set of weights w_l = 1, w_q = 10^-5 shall add 270 for Google and 25.4 for Yahoo. Table XV shows the contribution of query space in CEP for different sets of weights. It is to be noted that w_l is fixed to 1 for all sets, and only w_q is varied. As w_q is increased beyond 10^-5, the contribution of query space starts dominating over the contribution of latency. The set of weights w_l = 1, w_q = 10^-5 indicates that one can ignore the contribution of query space in comparison to the contribution of latencies, provided that one is more interested in comparing search engines with respect to latency. In that case, an approximate expression for CEP can be written as follows.

CEP^{k} \approx \sum_{i=1}^{pages} relevant^{k}_{i} \left( \frac{1}{d^{k}_{i}} \right)    (9)

Alternatively, one may consider an approach that is a combination of the definition of EP defined in (6) (together with (4) and (5)) and that of CEP defined in (7). In that, we may use the definition of the matrix S, which converts the contribution of query space into the form of binaries¹³. The modified definition is as follows.

CCEP^{k} = \sum_{i=1}^{pages} relevant^{k}_{i} \left( \frac{1}{d^{k}_{i}} + S^{k}_{i} \right)    (10)

where S^{k}_{i} is in accordance with the definition of S given by (5). The acronym CCEP stands for Combined Contributory Earned Points. If one wishes to incorporate weights, then the definition of CCEP becomes as follows.

CCEP^{k} = \sum_{i=1}^{pages} relevant^{k}_{i} \left( \frac{1}{d^{k}_{i}} w_{l} + S^{k}_{i} w_{q} \right)    (11)

¹³ We mean that the matrix S says either there is a contribution of the query space of a search engine, provided that its query space is larger than that of the other one, or there is no contribution of query space at all, if otherwise.
In the definition of CCEP given by (11), the weights can be comparable, and the dominant constituent problem mentioned earlier can be mitigated for comparable weights. We define comparable weights as follows.
Definition 3: A set of weights W = \{w_i\}, w_i \geq 0, is said to have comparable weights if and only if \sum_i w_i = 1 and the condition \frac{1}{9} \leq \frac{w_i}{w_j} \leq 9 is satisfied \forall w_i, w_j \in W.
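Definition 3 is easy to check mechanically; the following small Python helper is an illustrative reading of it rather than anything from the paper, and it simply reports whether a weight set is comparable.

```python
def is_comparable(weights, tol=1e-9):
    """Definition 3: weights are comparable iff they are non-negative, sum to 1,
    and every pairwise ratio w_i / w_j lies in [1/9, 9]."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > tol:
        return False
    positive = [w for w in weights if w > 0]
    # A zero weight next to a positive one makes some ratio fall below 1/9.
    if len(positive) != len(weights):
        return False
    return max(positive) / min(positive) <= 9

print(is_comparable([0.9, 0.1]))    # True  (ratio exactly 9, the boundary case)
print(is_comparable([0.95, 0.05]))  # False (ratio 19 exceeds 9)
```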
Table XVI shows the values of CCEP for different sets of comparable weights. We observe that the rate of decrease of CCEP for Yahoo is larger than that of Google. For example, for w_l = 0.9, w_q = 0.1, CCEP for Google is 486.3384 and that for Yahoo is 811.2465. For w_l = 0.8, w_q = 0.2, CCEP for Google is 442.3008 and that for Yahoo is 721.1080. In other words, the rate of decrease in CCEP for Google is 9.05% and that for Yahoo is 11.11%. The reason is that in the query space matrix, S, (see Table VIII) all entries are '1'. It means that the query space of Google is always larger than that of Yahoo. Therefore, in case of Yahoo, the contribution due to query space is always 0 irrespective of the weight assigned to it. However, in case of Google the contribution due to query space is nonzero and increases with an increase in the weight assigned to the contribution due to query space. Moreover, for the set of weights W = {w_l = 0.5, w_q = 0.5}, the values of CCEP are 310.1880 and 450.6925 for Google and Yahoo, respectively. It means that if one wishes to assign equal weights to latency and query space then Yahoo is the winner in terms of the parameter CCEP.
In case of CCEP, the effect of the dominating constituent problem is less as compared to that in case of CEP. In other words, the effect of large values of query space is fairly smaller in case of CCEP as compared to that in case of CEP. This is with reference to our remark that with the use of CCEP the dominating constituent problem is mitigated.

VI. CONCLUSIONS

In this paper, we analyzed the impact of correlation among queries on search results for two representative search portals, namely Google and Yahoo. The major accomplishments of the paper are as follows:
- We analyzed the search time, the query space and the number of relevant results per page for different permutations of the same query. We observed that these parameters vary with pages of searched results and are different for different permutations of the given query.
- We analyzed the impact of k-correlation among two subsequent queries given to a search engine. In that, we analyzed the search time and the query space. We observed that
  - The search time is less in case of queries with embedded semantics as compared to randomized queries without any semantic consideration.
  - In case of randomized queries, the query space is increased in case the given query includes a word that is frequently found on the Web, and vice versa.
Further, we considered a unified criteria for comparison between the search engines. Our criteria is based upon the concept of earned points. An end-user may assign different weights to different constituents of the criteria, namely latency and query space. Our observations are as follows.
- We observed that the performance of Yahoo is better in terms of the latencies; however, Google performs better in terms of query space.
- We discussed the dominant constituent problem. We discussed that this problem can be mitigated using the concept of contributory earned points if the weights assigned to the constituents are comparable. If both the constituents are assigned equal weights, we found that Yahoo is the winner.
However, the performance of a search engine may depend upon the criteria itself, and only one criteria may not be sufficient for an exact analysis of the performance. Further investigations and improvements in this direction form our future work.

REFERENCES
[1] S. Malhotra, "Beyond Google", CyberMedia Magazine on Data Quest, vol. 23, no. 24, pp. 12, December 2005.
[2] M.R. Henzinger, A. Haydon, M. Mitzenmacher, M. Nozark, "Measuring Index Quality Using Random Walks on the Web", Proceedings of 8th International World Wide Web Conference, pp. 213-225, May 1999.
[3] M.C. Tang, Y. Sun, "Evaluation of Web-Based Search Engines Using User-Effort Measures", Library and Information Science Research Electronic Journal, vol. 13, issue 2, 2003, http://libres.curtin.edu.au/libres13n2 /tang.htm.
[4] C.W. Cleverdon, J. Mills, E.M. Keen, An Inquiry in Testing of Information Retrieval Systems, Cranfield, U.K., 1966.
[5] J. Gwidzka, M. Chignell, "Towards Information Retrieval Measures for Evaluation of Web Search Engines", http://www.imedia.mie.utoronto.ca/people/jacek/pubs/webIR eval1 99.pdf, 1999.
[6] D. Rafiei, A.O. Mendelzon, "What is This Page Known For: Computing Web Page Reputations", Elsevier Journal on Computer Networks, vol. 33, pp. 823-835, 2000.
[7] N. Bhatti, A. Bouch, A. Kuchinsky, "Integrating User-Perceived Quality into Web Server Design", Elsevier Journal on Computer Networks, vol. 33, pp. 1-16, 2000.
[8] S. Brin, L. Page, "The Anatomy of a Large-Scale Hypertextual Web Search Engine", http://www-db.stanford.edu/pub/papers/google.pdf, 2000.
[9] J. Kleinberg, "Authoritative Sources in a Hyperlinked Environment", Proceedings of 9th ACM/SIAM Symposium on Discrete Algorithms, 1998.
[10] R. Motwani, P. Raghavan, Randomized Algorithms, Cambridge University Press, August 1995.
ARTIFICIAL IMMUNE SYSTEMS FOR ILLNESSES DIAGNOSTIC

Hiba Khelil, Abdelkader Benyettou


SIMPA Laboratory – University of Sciences and Technology of Oran,
PB 1505 M’naouer, 31000 Oran, Algeria
[email protected], [email protected]

ABSTRACT
Lately, many new illnesses are frequently observed in our societies; some of them could be avoided by regular visits to the doctor. Cancer is one of these illnesses, which patients often discover only when it is too late. In this work we propose an artificial Cancer diagnostic system which can classify patients according to whether they are affected by Cancer or not; for this goal we have developed an artificial immune system for Cancer diagnostic. The artificial immune system is one of the newest approaches used in several domains such as pattern recognition, robotics, intrusion detection and illness diagnostic; several methods have been proposed, such as negative selection, clonal selection and the artificial immune network (AINet). In this paper we present the natural immune system, we also develop four versions of the Artificial Immune Recognition System (AIRS), and afterwards we present results for Cancer diagnostic with some critics and remarks on these methods.

Keywords: Antigen, Antibody, B memory cells, Artificial Recognition Ball (ARB), Artificial Immune Recognition System (AIRS), Cancer diagnostic.

1 INTRODUCTION

Pattern recognition is a very vast domain of artificial intelligence, where we can find face, fingerprint, speech and handwriting recognition and other patterns that are no less important than the ones mentioned; for this goal several approaches have been developed, such as neural networks, evolutionary algorithms, genetic algorithms and others under exploitation. The artificial immune system is a new approach used in different domains such as pattern recognition [1] [2] [3] [4] [5], intrusion detection in Internet networks [6], robotics [7], machine learning [8] and various other applications. The artificial immune functions are inspired from the natural immune system, where the cells responsible for the immune response are simulated to give an artificial approach adapted according to the application domain and the main problem.
The present work is an application of the artificial immune recognition system (AIRS) for Cancer diagnostic. AIRS is a method inspired from the biologic immune system for pattern recognition (classification), proposed by A. Watkins in 2001 [9] in his Master thesis at Mississippi University; an improvement was made in 2004 by A. Watkins, J. Timmis and L. Boggess [10], where the authors optimize the number of B cells generated. This method is characterized by distributed training, proven by A. Watkins in his PhD at the University of Kent in 2005 [11]. In this paper we will begin with a short definition of the natural immune system and the immune response types. The second part is a representation of an artificial simulation of the immune system, giving a description of the training algorithms. As a prototype of training we will present a preview of the Cancer databases and the results of the application of the artificial immune system for Cancer diagnostic. Finally, some critics are given to show the limits and differences between the methods, and some perspectives also.

2 NATURAL IMMUNE SYSTEM

The biologic immune system constitutes a weapon against intruders in a given body; for this goal several cells contribute to eliminate this intruder, named antigen; these cells participate in a 'biologic immune response'.
We distinguish two types of natural immune response: one is innate and the other is acquired, explained in the following points:

2.1 Innate Immunity
It is an elementary immunity in which a very reduced number of antigens are handled; we can find this type of immunity in newborns not yet vaccinated. A non-adaptive immunity over a long time can lead to infections and death because the body is not well armed against the antigens of the environment [12].

2.2 Acquired Immunity
It is an immunity endowed with a memory, named also the secondary response, triggered after the



apparition of the same antigen in the same immune system for the second time or more, where it generates the development of B memory cells for this type of antigen already met (memorized) in the system. This response is faster than the innate one [12] and causes an increase of the temperature of the body, which can be explained by the fighting of B cells against antigens.
The primary immune response is only slower, but it keeps information about the passage of antigens in the system; this memorization phenomenon will interest us to use it for artificial pattern recognition; according to this principle the artificial immune recognition system is developed, and it is the main subject of this paper.

3 THE ARTIFICIAL IMMUNE SYSTEM

The natural immune system is very complicated to be artificially simulated, but A. B. Watkins succeeded in simulating the most important functions of the natural immune system for pattern recognition. The main factors entering in the artificial immune system are antigens, antibodies and B memory cells. We will present in the next section the training algorithms that put to work the noted factors (antibodies, B memory cells and antigens).

4 THE AIRS ALGORITHM

The present algorithm is inspired from A. B. Watkins' thesis [9] [13] [14], which presents the artificial immune recognition system intended for pattern recognition, named AIRS. First, antigens represent the training data used in the training program in order to generate antibodies (B cells) to be used in the test step (classification). We can note that there are four training steps in the artificial immune training algorithm, as follows:

4.1 Initialization Step
In this step, all characteristic vectors of antigens are normalized, and the affinity threshold is calculated by (1)

affinity\_threshold = \frac{\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} affinity(ag_i, ag_j)}{\frac{n(n-1)}{2}}    (1)

Noting here that affinity is the Euclidean distance between two antigens, and n is the number of antigens (cardinality of the training data).
To begin, we must initialize the B cells memory set (MC) and the ARB population by choosing arbitrary examples from the training data.

4.2 B Cells Identification and ARBs Generation
Once initialization is finished, this step is executed for each antigen from the training data. First, the memory cell mc_match is chosen from MC as the one which has the least value of affinity (maximizing the stimulation) with this antigen, noting that:

stimulation(ag, mc) = 1 - affinity(ag, mc)    (2)

This cell will be conserved for a long time by cloning and generates new ARBs; these ARBs will be added to the old ARB set; the clone number is calculated by formula (3)

clone\_number = hyper\_clonal\_rate * clonal\_rate * stimulation(mc_{match}, ag)    (3)

Every clone is mutated according to a small algorithm described in [9], which consists in altering the characteristic cell vectors.

4.3 Competition for Resources and Development of a Candidate Memory Cell
In this step, the ARBs' information is completed by calculating the resource allocations as a function of their stimulation, as follows:

resource = stimulation(ag, ARB(antibody)) * clonal\_rate    (4)

and we also calculate the average stimulation for each ARB class. This step can kill some ARBs which are low stimulated.
After that, we clone and mutate the subset of ARBs according to their stimulation level.
While the average stimulation value of each ARB class (s_i) is less than a given stimulation threshold, we repeat the third step.

s_i = \frac{\sum_{j=1}^{|AB_i|} ab_j.stim}{|AB_i|}, \quad ab_j \in AB_i    (5)

s_i \geq affinity\_threshold    (6)

4.4 Memory Cell Introduction
Select the ARB of the same class as the antigen with the highest affinity. If the affinity of this ARB with the antigenic pattern is better than that of mc_match, then add the candidate cell mc_candidate to the memory cells set. Additionally, if the affinity of mc_match and mc_candidate is below the affinity threshold, then remove mc_match from the memory set.
Then we repeat the second step until all antigens have been treated.
After the end of the training phase, the test is executed using the memory cells generated from the training step, in order to classify the new antigenic patterns. The criterion of classification is to attribute the new antigen to the most appropriate class using KMeans or KNN (K Nearest Neighbor); in this paper we will present classification results using the KMeans algorithm.

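To make the quantities in (1)-(3) concrete, the following Python sketch computes the affinity threshold, stimulation and clone number for normalized antigen vectors; it is an illustrative reading of the formulas above, assuming Euclidean affinity on normalized vectors, and it is not the authors' C++ implementation.

```python
import math

def affinity(a, b):
    """Euclidean distance between two normalized characteristic vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def affinity_threshold(antigens):
    """Formula (1): mean pairwise affinity over the n(n-1)/2 antigen pairs."""
    n = len(antigens)
    total = sum(affinity(antigens[i], antigens[j])
                for i in range(n - 1) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2)

def stimulation(ag, cell):
    """Formula (2): stimulation is the complement of affinity."""
    return 1.0 - affinity(ag, cell)

def clone_number(ag, mc_match, hyper_clonal_rate=30, clonal_rate=20):
    """Formula (3): clones generated from the best-matching memory cell.
    The default rates are the values reported in Table 1 of this paper."""
    return int(hyper_clonal_rate * clonal_rate * stimulation(ag, mc_match))

# Tiny example with made-up, already-normalized vectors:
antigens = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.4], [0.9, 0.8, 0.7]]
print(affinity_threshold(antigens))
print(clone_number(antigens[0], antigens[1]))
```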
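Continuing the sketch, the resource allocation of (4) and the class-level stopping test of (5)-(6) might look as follows; ARBs are modelled here as plain dictionaries, which is an assumption made only for illustration.

```python
def allocate_resources(arbs, ag, stimulation, clonal_rate=20):
    """Formula (4): each ARB receives resources proportional to its stimulation."""
    for arb in arbs:
        arb["stim"] = stimulation(ag, arb["vector"])
        arb["resources"] = arb["stim"] * clonal_rate
    return arbs

def class_average_stimulation(arbs, label):
    """Formula (5): mean stimulation of the ARBs belonging to one class."""
    same = [a["stim"] for a in arbs if a["class"] == label]
    return sum(same) / len(same)

def training_round_done(arbs, labels, affinity_threshold):
    """Stopping test (6): every class must reach the stimulation threshold."""
    return all(class_average_stimulation(arbs, c) >= affinity_threshold
               for c in labels)
```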


5 THE AIRS2 ALGORITHM

The changes made to the AIRS algorithm are small, but they offer simplicity of implementation, data reduction and a reduced processing time.
The AIRS2 training steps are the same as the AIRS ones, with just some changes, which are presented as follows:
1- It is not necessary to initialize the ARB set.
2- It is not necessary to mutate the ARBs' class feature, because in AIRS2 we are interested only in cells of the same class as the antigen.
3- Resources are only allocated to ARBs of the same class as the antigen and are allocated in proportion to the ARB's stimulation level in reaction to the antigen.
4- The training stopping criterion no longer takes into account the stimulation value of ARBs in all classes, but only accounts for the stimulation value of the ARBs of the same class as the antigen.

6 AIRS AND AIRS2 ALGORITHMS USING MERGING FACTOR

In this section we present another modification of AIRS and AIRS2. This modification concerns the last training step (Memory cell introduction), mainly the cell introduction criterion; the condition was as follows:

CandStim ← Stimulation(ag, mc_candidate)
MatchStim ← Stimulation(ag, mc_match)
CellAff ← affinity(mc_candidate, mc_match)    (7)
if (CandStim > MatchStim)
    if (CellAff < AT * ATS)
        MC ← MC - mc_match
        MC ← MC ∪ mc_candidate

This pseudocode explains the conditions to add mc_candidate to, and delete mc_match from, the memory cells set; the modification is carried in the following condition:

if (CellAff < AT * ATS + factor)    (8)

Noting that factor is calculated by:

factor = AT * ATS * dampener * log(np)    (9)

where ATS and dampener are two parameters between 0 and 1, and np is the number of training programs executed in parallel (the number of classes).
This change to the merging scheme relaxes the criterion for memory cell removal in the affinity based merging scheme by a small logarithmic fraction.
This modification is used in the two algorithms (AIRS and AIRS2), and all algorithms are applied for Cancer diagnostics; all results will be presented in the next sections.
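A compact Python sketch of the memory cell introduction step with the merging factor of (8) and (9) is given below; it is a schematic reading of the pseudocode above (AT is the affinity threshold, ATS and dampener are the two parameters in [0, 1], and memory cells are assumed hashable), not the authors' code.

```python
import math

def introduce_memory_cell(mc_set, ag, mc_match, mc_candidate,
                          AT, ATS, dampener, np, stimulation, affinity):
    """Memory cell introduction with the relaxed merging criterion (8)-(9).

    mc_set: current set of memory cells; stimulation and affinity are the
    functions used throughout the paper (formula (2) and Euclidean distance).
    """
    cand_stim = stimulation(ag, mc_candidate)
    match_stim = stimulation(ag, mc_match)
    cell_aff = affinity(mc_candidate, mc_match)

    # factor of formula (9): a small logarithmic relaxation of the cut-off.
    factor = AT * ATS * dampener * math.log(np)

    if cand_stim > match_stim:
        mc_set.add(mc_candidate)
        # Relaxed removal criterion of formula (8); the original AIRS test
        # is recovered by setting factor = 0.
        if cell_aff < AT * ATS + factor:
            mc_set.discard(mc_match)
    return mc_set
```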
7 RESULTS

To determine the relative performance of the AIRS algorithms, it was necessary to test them on data bases; so we have chosen three Cancer data bases from the hospitable academic center of Wisconsin: Breast Cancer Wisconsin (BCW), Wisconsin Prognostic Breast Cancer (WPBC) and Wisconsin Diagnostic Breast Cancer (WDBC). The description of these data bases is given as follows:
- Breast Cancer Wisconsin (BCW): This data base was obtained from the hospitable academic center of Wisconsin in 1991; it describes the cancerous symptoms and classifies them into two classes: 'Malignant' or 'Benign'. The distribution of patients is as follows: (Malignant, 214) (Benign, 458).
- Wisconsin Prognostic Breast Cancer (WPBC): This data base was conceived by the same hospitable academic center in 1995, but it gives more details than BCW, giving nucleus-of-cell observations. Based on these characteristics, patients are classified into two classes: 'Recur' and 'NonRecur', with the following distribution: (Recur, 47) (NonRecur, 151).
- Wisconsin Diagnostic Breast Cancer (WDBC): This data base was also conceived by the same hospitable academic center in 1995; it has the same attributes as WPBC, but it classifies its patients into two classes: 'Malignant' and 'Benign', with the following distribution: (Malignant, 212) (Benign, 357).
All training data are antigens, represented by characteristic vectors; antibodies also have the same characteristic vector size as antigens. The ARB is represented as a structure holding the antibody characteristic vector, its stimulation with the antigen and the resources that are allowed to it.

7.1 Software and Hardware Resources
In order to apply the algorithms we have used the C++ language in a Linux Mandriva 2006 environment; every machine is endowed with 512 MB of memory and a 3.0 GHz processor frequency.
All training programs have the same number of antigens, the same number of initial memory cells and of ARBs also. In the same way, the training program is an iterative process; we have fixed 50 iterations for each training program of every class.

7.2 Results and Classification Accuracy
To run the programs, we must fix the most important training parameters, namely hyper_clonal_rate, clonal_rate and mutation_rate. These parameters are used in the training steps, as a criterion to limit the clone number, to calculate the ARB's resources and in the mutation procedure also. The parameter values are given in Table 1:



Table 1: Training parameters.
Parameters         Type                  Values
Hyper_clonal_rate  Integer value         30
Clonal_rate        Integer value         20
Mutation_rate      Real value in [0,1]   0.1

After 50 iterations, the B memory cells generated from each training program of the classes are used in the classification step (test). In the classification we take the shortest distance between the new antigen and the gravity centers of all memory cell sets, and we assign this antigen to the same class as the nearest center (KMeans). Using this principle the classification accuracies are given in Table 2.

Table 2: Classification accuracies

Table 3: Average classification accuracies

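The nearest-gravity-center classification rule described above can be sketched in a few lines of Python; this is only an illustration of the decision rule, with hypothetical memory cell sets and a Euclidean affinity, not the authors' implementation.

```python
import math

def gravity_center(cells):
    """Component-wise mean of the memory cells of one class."""
    n = len(cells)
    return [sum(c[d] for c in cells) / n for d in range(len(cells[0]))]

def classify(antigen, memory_cells_by_class, affinity):
    """Assign the antigen to the class whose memory-cell gravity center is nearest."""
    centers = {label: gravity_center(cells)
               for label, cells in memory_cells_by_class.items()}
    return min(centers, key=lambda label: affinity(antigen, centers[label]))

# Example with two hypothetical classes:
euclid = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
memory = {"Malignant": [[0.8, 0.9], [0.7, 0.8]], "Benign": [[0.1, 0.2], [0.2, 0.1]]}
print(classify([0.15, 0.18], memory, euclid))   # -> "Benign"
```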


We can observe that AIRS gives in general the best results, which can be explained by the fact that it generates more B cells and thus increases the recognition chance. AIRS2 and AIRS2 using factor are ameliorations of AIRS, but the B cells generated are fewer than in the original algorithm (AIRS); that is why these methods (AIRS2, AIRS2 with factor) do not give better results in Cancer diagnostic except in some cases; the evolution of B cells is given in the next section.
The execution of AIRS2 and AIRS2 using factor is faster than that of AIRS and AIRS using factor; this can be explained by using just the B cells of the same class as the antigen, which reduces the treatments and therefore the processing time also.

Figure 1: Evolution of B cells in BCW (AIRS)
Figure 2: Evolution of B cells in WPBC (AIRS2 using factor)
Figure 3: Evolution of B cells in WDBC (AIRS2)

From the figures we observe that the evolution of B cells in AIRS is faster than in AIRS2 and AIRS2 using factor, because more cells are deleted on changing the condition given in equation (8) (using factor).

7.3 B Cells Evolution
Noting that we have given the same chance to each training program (quantity of antigens introduced, initial memory cells and initial ARBs), the B cells generated are not necessarily the same in each method; the previous tables give us the B cells generated in each one. We can observe that the cells generated in AIRS2 and AIRS2 using factor are fewer than in AIRS and AIRS using factor, as mentioned before, although we initialized all B cell sets to the same size. The figures above represent the evolution of B cells as a function of iterations, for each data base, for the best rate among the four methods.

8 DISCUSSION OF RESULTS

The results of the experiments can be found in Tables 2 and 3; comparing the used methods we can observe that AIRS in general gives the best results, and we can observe also that this method generates more B cells than the others.
In all experiments we have used the Euclidean distance; it is possible to use the Hamming distance or another one.
AIRS and AIRS using factor converge slowly to the B cells most adapted for Cancer diagnostic, on the contrary of AIRS2 and AIRS2 using factor, which are executed more quickly than the others but do not give us the best memory cells.

9 CONCLUSIONS

In this paper we have presented Cancer diagnostic results for the AIRS immuno-computing algorithm and provided directions for interpretation of these results. We are interested in immuno-computing because it is one of the newest directions in bio-inspired machine learning, and we focused on AIRS, AIRS2, AIRS using factor and AIRS2 using factor (2001-2005), which can also be used for classification (illness diagnostic).
We suggest that AIRS is a mature classifier that delivers reasonable results and that it can safely be used for real world classification tasks. The presented results are good but they must be improved using optimization algorithms. In our future work we want to give more importance to the parameter values and propose a new method to search for the best values of these in order to increase the performance of these algorithms.



10 REFERENCES

[1] Secker A., Freitas A., Timmis J.: AISEC: An artificial immune system for e-mail classification. In proceedings of the Congress on Evolutionary Computation, pp. 131--139, Canberra, Australia (2003)
[2] Lingjun M., Peter V. D. P., Haiyang W.: A comprehensive benchmark of the artificial immune recognition system (AIRS). In proceedings of advanced data mining and applications ADMA, vol. 3584, pp. 575--582, China (2005)
[3] Deneche A., Meshoul S., Batouche M.: Une approche hybride pour la reconnaissance des formes en utilisant un systeme immunitaire artificiel. In proceedings of graphic computer science, Biskra, Algeria (2005)
[4] Deneche A.: Approches bios inspirees pour la reconnaissance de formes, Master thesis, Mentouri University, Constantine, Algeria (2006)
[5] Goodman D., Boggess L., Watkins A.: Artificial immune system classification of multiple class problems, Intelligent Engineering Systems Through Artificial Neural press (2002)
[6] Kim J., Bently P.: Towards an artificial immune system for network intrusion detection: an investigation of clonal selection with a negative selection operator. In proceedings of the Congress on Evolutionary Computation, vol. 2, pp. 1244--1252, South Korea (2001)
[7] Jun J. H., Lee D. W., Sim K. B.: Realization of cooperative and swarm behavior in distributed autonomous robotic systems using artificial immune system. In proceedings of the IEEE international conference on Systems, Man and Cybernetics, vol. 6, pp. 614--619. IEEE Press, New York (1999)
[8] Timmis J.: Artificial immune systems: a novel data analysis technique inspired by the immune network theory, PhD thesis, University of Wales, UK (2000)
[9] Watkins A.: AIRS: A resource limited artificial immune classifier, Master thesis, Mississippi University (2001)
[10] Watkins A., Timmis J., Boggess L.: Artificial immune recognition system (AIRS): an immune inspired supervised learning algorithm, vol. 5, pp. 291--317, Genetic Programming and Evolvable Machines press (2004)
[11] Watkins A.: Exploiting immunological metaphors in the development of serial, parallel, and distributed learning algorithms, PhD thesis, Kent University (2005)
[12] Emilie P.: Organisation du systeme immunitaire felin, PhD thesis, National school Lyon, France (2006)
[13] Watkins A., Timmis J.: Artificial immune recognition system (AIRS): revisions and refinements. In proceedings of the first international conference on artificial immune systems ICARIS, pp. 173--181, Kent University (2005)
[14] Watkins A., Boggess L.: A new classifier based on resources limited artificial immune systems. In proceedings of the Congress on Evolutionary Computation, IEEE World Congress on Computational Intelligence held in Honolulu, HI, USA, pp. 1546--1551, Kent University (2005)


HOT METHOD PREDICTION USING SUPPORT VECTOR MACHINES

Sandra Johnson ,Dr S Valli


Department of Computer Science and Engineering, Anna University, Chennai – 600 025, India.
[email protected] , [email protected]

ABSTRACT
Runtime hot method detection being an important dynamic compiler optimization
parameter, has challenged researchers to explore and refine techniques to address
the problem of expensive profiling overhead incurred during the process. Although
the recent trend has been toward the application of machine learning heuristics in
compiler optimization, its role in identification and prediction of hot methods has
been ignored. The aim of this work is to develop a model using the machine
learning algorithm, the Support Vector Machine (SVM) to identify and predict hot
methods in a given program, to which the best set of optimizations could be
applied. When trained with ten static program features, the derived model predicts
hot methods with an appreciable 62.57% accuracy.

Keywords: Machine Learning, Support Vector Machines, Hot Methods, Virtual


Machines.

1 INTRODUCTION results of the evaluation. Section 7 proposes future


work and concludes the paper.
Optimizers depend on profile information to
identify hot methods of program segments. The 2 RELATED WORK
major inadequacy associated with the dynamic
optimization technique is the high cost of accurate Machine learning techniques are currently used
data profiling via program instrumentation. The to automate the construction of well-designed
major challenge is how to minimize the overhead individual optimization heuristics. In addition, the
that includes profile collection, optimization strategy search is on for automatic detection of a program
selection and re-optimization. segment for targeted optimization. While no previous
While there is a significant amount of work work to the best of our knowledge has used ML for
relating to cost effective and performance efficient predicting program hot spots, this section reviews the
machine learning (ML) techniques to tune individual research papers which use ML for compiler
optimization heuristics, relatively little work has optimization heuristics.
been done on the identification and prediction of In a recent review of research on the challenges
frequently executed program hot spots using confronting dynamic compiler optimizers, Arnold et
machine learning algorithms so as to target the best al. [1] give a detailed review of adaptive
set of optimizations. In this study it is proposed to optimizations used in the virtual machine
develop a machine learning based predictive model environment. They conclude that feedback-directed
using the Support Vector Machine (SVM) classifier. optimization techniques are not well used in
Ten features have been derived from the chosen production systems.
domain knowledge, for training and testing the Shun Long et al. [3] have used the Instance-
classifiers. The training data set are collected from based learning algorithm to identify the best
the SPEC CPU2000 INT and UTDSP benchmark transformations for each program. For each
programs. The SVM classifier is trained offline with optimized program, a database stores the
the training data set and it is used in predicting the transformations selected, the program features and
hot methods of a program which are not trained. This the resulting speedup. The aim is to apply
system is evaluated for the program’s hot method appropriate transformations when a new program is
prediction accuracy. encountered.
This paper is structured as follows. Section 2 Cavazos et al. [4] have applied an offline ML
discusses related work. Section 3 gives a brief technique to decide whether to inline a method or not.
overview of Support Vector Machines. In Section 4 The adaptive system uses online profile data to
this approach and in section 5 the evaluation identify “hot methods” and method calls in the hot
methodology is described. Section 6 presents the methods are in-lined using the ML heuristics.

Cavazos et al. [5, 12] have also used supervised machine learning to identify the best procedure clone
learning to decide on which optimization algorithm for the current run of the program. M. Stephenson et
to use: either graph coloring or Linear scan for al. [18] have used two machine learning algorithms,
register allocation. They have used three categories the nearest neighbor (NN) and Support Vector
of method level features for ML heuristics (i.e.) Machines (SVMs), to predict the loop unroll factor.
features of edges of a control flow graph, features None of these approaches aims at prediction at the
related to live intervals and finally, statistical method level. However, machine learning has been
features about the size of a method. widely used in work on branch prediction [21, 22, 23,
Cavazos et al. [11] report that the best of 24].
compiler optimizations is method dependent rather
than program dependent. Their paper describes how, 3 SUPPORT VECTOR MACHINES
logistic regression-based machine learning technique
trained using only static features of a method, is used The SVM [15, 16] classification maps a training
to automatically derive a simple predictive model data (xi,yi), i = 1,…,n where each instance is a set of
that selects the best set of optimizations for feature values xi ∈ Rn and a class label y ∈ {+1,-1},
individual methods within a dynamic compiler. They into a higher-dimensional feature space φ(x) and
take into consideration the structures of a particular defines a separating hyperplane. Only two types of
method within a program to develop a sequence of data can be separated by the SVM which is a binary
optimization phases. The automatically constructed classifier. Fig. 1 shows a linear SVM hyperplane
regression model is shown to out-perform hand- separating two classes.
tuned models. The linear separation in the feature space is done
To identify basic blocks for instruction using the dot product φ(x).φ(y). Positive definite
scheduling Cavazos et al. [20] have used supervised kernel functions k(x, y) correspond to feature space
learning. Monsifrot et al. [2] have used a decision dot products and are therefore used in the training
tree learning algorithm to identify loops for unrolling. algorithm instead of the dot product as in Eq. (1):
Most of the work [4, 5, 11, 12, 20] is implemented
and evaluated using Jikes RVM. k ( x, y ) = (Φ ( x) • Φ ( y )) (1)
The authors [8, 19] have used genetic
programming to choose an effective priority function
which prioritizes the various compiler options The decision function given by the SVM is given in
available. They have chosen hyper-block formation, Eq. (2):
register allocation and data pre-fetching for
n
evaluating their optimizations.
Agakov et al. [9] have applied machine learning f ( x ) = ∑ vi .k ( x, xi ) + b (2)
i =1
to speed up search-based iterative optimization. The
statistical technique of the Principal component
where b is a bias parameter, x is the training example
analysis (PCA) is used in their work for appropriate
and vi is the solution to a quadratic optimization
program feature selection. The program features
problem. The margin of separation extending from
collected off-line from a set of training programs are
the hyperplane gives the solution of the quadratic
used for learning by the nearest neighbor algorithm.
optimization problem.
Features are then extracted for a new program and
are processed by the PCA before they are classified,
using the nearest neighbor algorithm. This reduces Optimal Hyperplane
the search space to a few good transformations for
the new program from the various available source-
level transformations. However, this model can be Margin of Separation
applied only to whole programs.
The authors [10] present a machine learning-
based model to predict the performance of a
modified program using static source code features
and features like execution frequencies of basic
blocks which are extracted from the profile data
collected. As proposed in their approach [9], the Feature Space

authors have used the PCA to reduce the feature set. Figure 3: Optimal hyperplane and margin of
A linear regression model and an artificial neural separation
network model are used for building the prediction
model which is shown to work better than the non- 4 HOT METHOD PREDICTION
feature-based predictors.
In their work Fursin et al. [14] have used This section briefly describes how machine

UbiCC Journal - VolumeUbiquitous


3 Computing and Communication Journal 2 76
learning could be used in developing a model to Machine’s (LLVM) [6] bytecode representation of
predict hot methods within a program. A discussion the programs provides the training as well as the test
of the approach is followed by the scheme of the data set. The system architecture for the SVM-based
SVM-based strategy adopted in this study. hot method predictive model is shown in Fig.2 and it
closely resembles the architecture proposed by the
authors C. L. Huang et. al. [26]. Fig. 3 outlines the
strategies for building a predictive model.

1. Create training data set.


a. Collect method level features
i. Calculate the specified feature for every
method in a LLVM bytecode.
ii. Store the feature set in a vector.
b. Label each method
i. Instrument each method in the program
with a counter variable [25].
ii. Execute the program and collect the
frequency of the execution of each
method.
iii. Using the profile information, each
method is labeled as either hot or cold.
iv. Write the label and its corresponding
feature vector for every method in a file.
c. Steps (a) & (b) are repeated for as many
programs as are required for training.
2. Train the predictive model.
a. The feature data set is used to train the
SVM-based model.
b. The predictive model is generated as output.
3. Create test data set.
a. Collect method level features.
i. Calculate the specified features for every
method in a new program.
ii. Store the feature set in a vector.
iii. Assign the label ‘0’ for each feature
vector in a file.
4. Predict the label as either hot or cold for the test
data generated in step 3 using the predictive
model derived in step 2.

Figure 3: System outline
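As an illustration of steps 1(a)-1(b) of the outline above, the following Python sketch labels methods as hot or cold from profiled execution counts and writes each labeled feature vector in the sparse "label index:value" format used in the sample data sets later in this section; the feature extraction itself (from LLVM bytecode) is assumed to have been done elsewhere, and all names here are placeholders rather than the authors' code.

```python
def label_methods(exec_counts, hot_threshold_percent=50):
    """Mark the top N% most frequently executed methods as hot (+1), the rest cold (-1)."""
    ranked = sorted(exec_counts, key=exec_counts.get, reverse=True)
    n_hot = max(1, round(len(ranked) * hot_threshold_percent / 100))
    return {m: (+1 if i < n_hot else -1) for i, m in enumerate(ranked)}

def write_training_file(path, feature_vectors, labels):
    """Write one 'label 1:v1 2:v2 ...' line per method, as in the sample data set."""
    with open(path, "w") as out:
        for method, features in feature_vectors.items():
            cols = " ".join(f"{i + 1}:{v}" for i, v in enumerate(features))
            out.write(f"{labels[method]:+d} {cols}\n")

# Hypothetical example: two methods, ten static features each, profiled counts.
features = {"foo": [1, 1, 1, 0.88, 2.51, 0.87, 0.63, 1.23, 1.59, 29],
            "bar": [0, 0, 0, 1.16, 1.25, 1.02, 3.38, 1.50, 1.83, 2]}
counts = {"foo": 120000, "bar": 35}
write_training_file("train.dat", features, label_methods(counts))
```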

4.2 Extracting program features


The ‘C’ programs used for training are converted
Figure 2: System architecture of the SVM-based hot into LLVM bytecodes using the LLVM frontend.
method predictive model Every bytecode file is organized into a single module.
Each module contains methods which are either user-
defined or pre-defined. Only static features of the
4.1 The approach
Static features of each method in a program are user-defined methods are extracted from the
collected by offline program analysis. Each of these bytecode module, for the simple reason that they can
method level features forms a feature vector which is be easily collected by an offline program analysis.
labeled either hot or cold based on classification by a Table 1 lists the 10 static features that are used to
prior execution of the program. The training data set train the classifier. Each feature value of a method is
thus generated is used to train the SVM-based calculated in relation to an identical feature value
predictive model. Next, the test data set is created by extracted from the entire bytecode module. The
offline program analysis of a newly encountered collection of all the feature values for a method
program. The trained model is used to predict constitutes the feature vector xi. This feature vector
whether a method is hot or cold for the new program. xi is stored for subsequent labeling. Each feature
An offline analysis on the Low Level Virtual vector xi is then labeled yi and classified as either hot

UbiCC Journal - VolumeUbiquitous


3 Computing and Communication Journal 3 77
(+1) or cold (-1) based on an arbitrary threshold scheme described in the next section.

Table 1: Static features for identifying hot methods.

1. Number of loops in a method.
2. Average loop depth of all the loops in the method.
3. Number of top level loops in a method.
4. Number of bytecode level instructions in the method.
5. Number of Call instructions in a method.
6. Number of Load instructions in a method.
7. Number of Store instructions in a method.
8. Number of Branch instructions in the method.
9. Number of Basic Blocks in the method.
10. Number of call sites for each method.

4.3 Extracting method execution frequencies

Hot methods are frequently executing program segments. To identify hot and cold methods within a training program, profile information is gathered during execution. The training bytecode modules are instrumented with a counter variable in each user-defined method. This instrumented bytecode module is then executed and the execution frequency of each method is collected. Using this profile information, the top 'N' most frequently executed methods are classified as hot. This system keeps the value 'N' as the "hot method threshold". In this scheme of classification, each feature vector (xi) is labeled yi (+1) for hot methods and yi (-1) for cold methods. The feature vector (xi) along with its label (yi) is then written into a training dataset file. Similarly, the training data sets of the different training programs are accumulated in the same file. This file is used as an input to train the predictive model.

+1 1:1 2:1 3:1 4:0.880046 5:2.51046 6:0.875912 7:0.634249 8:1.23119 9:1.59314 10:29
-1 1:0 2:0 3:0 4:1.16702 5:1.25523 6:1.0219 7:3.38266 8:1.50479 9:1.83824 10:2
+1 1:2 2:2 3:2 4:1.47312 5:0.83682 6:1.89781 7:1.47992 8:2.59918 9:2.81863 10:3

Figure 4: Sample training data set

The general format of a feature vector is
yi 1:xi1 2:xi2 3:xi3 ... j:xij
where 1, 2, 3, ..., j are the feature numbers and xi1, xi2, ..., xij are their corresponding feature values. The first element in each vector is the label yi (+1 or -1); each subsequent element gives a feature number followed by its feature value. Fig. 4 shows a sample of three feature vectors from the training dataset collected for the user-defined methods found in a SPEC benchmark program. The first feature vector in Fig. 4 is a hot method and is labeled +1. The values of the ten features are serially listed; for example, '1' is the value of feature 1 and '29' is the value of feature 10. The value '1' of feature 1 indicates the percent of loops found in the method. With the "hot method threshold" set at 50%, 4 out of the 8 most frequently executed methods in a program are designated as hot methods.

4.4 Creating test data set

When a new program is encountered, the test data set is collected in a way similar to the training data set, except that the label is specified as zero.

0 1:1 2:1 3:1 4:1.13098 5:2.91262 6:2.05479 7:1.09091 8:1.34875 9:1.55172 10:34
0 1:0 2:0 3:0 4:0.552341 5:0.970874 6:1.14155 7:0.363636 8:0.385356 9:0.862069 10:4
0 1:1 2:1 3:1 4:1.26249 5:0 6:2.51142 7:2.90909 8:1.15607 9:1.2069 10:40

Figure 5: Sample test data set

4.5 Training and prediction using SVM

Using the training data set file as input, the machine learning algorithm SVM is trained with default parameters (C-SVM, C=1, radial basis function). Once trained, the predictive model is generated as output. The derived model is used to predict the label for each feature vector in the test data set file. The training and prediction are done offline. Subsequently, the new program used for creating the test data set is instrumented. Executing this instrumented program provides the most frequently executed methods. The prediction accuracy of the system is evaluated by comparing the predicted output with the actual profile values.

5 EVALUATION

5.1 Method

Prediction accuracy is defined as the ratio of events correctly predicted to all the events encountered. This prediction accuracy is of two types: hot method prediction accuracy and total prediction accuracy. Hot method prediction accuracy is the ratio of correct hot method predictions to the actual number of hot methods in a program, whereas total prediction accuracy is the ratio of correct predictions (either hot or cold) to the total number of methods in a program. Hot method prediction accuracy is evaluated at three hot method threshold levels: 50%, 40% and 30%.
The leave-one-out cross-validation method is used in evaluating this system. This is a standard machine learning technique where 'n' benchmark programs are used iteratively for evaluation. One of the 'n' programs is used for testing and the remaining 'n-1' programs are used for training the model. This is repeated for all 'n' programs in the benchmark suite.
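As an illustration of one leave-one-out fold of this evaluation, the sketch below uses scikit-learn's SVC (which wraps the same libSVM library [7] used by the system) with the paper's default parameters. The per-program file names and the accuracy helper are assumptions for illustration, not part of the original tool chain.

    import numpy as np
    from sklearn.datasets import load_svmlight_file
    from sklearn.svm import SVC

    def hot_and_total_accuracy(y_true, y_pred):
        # total accuracy: all methods; hot accuracy: only methods that are actually hot (+1)
        total = np.mean(y_pred == y_true)
        hot = y_true == 1
        return np.mean(y_pred[hot] == 1), total

    # one leave-one-out fold: train on n-1 benchmark files, test on the left-out one
    train_files = ["164.gzip.dat", "175.vpr.dat", "181.mcf.dat"]   # hypothetical per-program files
    test_file = "197.parser.dat"

    X_parts, y_parts = zip(*(load_svmlight_file(f, n_features=10) for f in train_files))
    X_train = np.vstack([x.toarray() for x in X_parts])
    y_train = np.concatenate(y_parts)

    model = SVC(kernel="rbf", C=1.0)            # C-SVM with radial basis function kernel
    model.fit(X_train, y_train)

    X_test, y_test = load_svmlight_file(test_file, n_features=10)
    y_pred = model.predict(X_test.toarray())
    print(hot_and_total_accuracy(y_test, y_pred))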
5.2 Benchmarks

Two benchmark suites, SPEC CPU2000 INT [17] and UTDSP [13], have been used for training and prediction. UTDSP is a C benchmark and SPEC CPU2000 INT has C and C++ benchmarks. Evaluation of the system is based on only the C programs of either benchmark. The model trained from the 'n-1' benchmark programs in the suite is used to predict the hot methods in the left-out benchmark program.

5.3 Tools and platform

The system is implemented in the Low Level Virtual Machine (LLVM) version 1.6 [6]. LLVM is an open source compiler infrastructure that supports compile time, link time, run time and idle time optimizations. The results are evaluated on an Intel(R) Pentium(R) D with 2.80 GHz and 480 MB of RAM running Fedora Core 4. This system uses the libSVM tool [7]. It is a simple library for Support Vector Machines written in C.

Figure 6: Hot method prediction accuracy on the SPEC CPU2000 INT benchmark (bar chart of prediction accuracy % per SPEC CPU2000 INT program at hot method thresholds of 50%, 40% and 30%)

6 RESULTS

Fig. 6 shows the prediction accuracy of the trained model on the SPEC CPU2000 INT benchmark programs at three different hot method thresholds: 50%, 40% and 30%. The hot method prediction accuracy for all C programs on the benchmark is found to vary from 0% to 100%, with an average of 57.86%, 51.43% and 39.14% for the three hot method thresholds respectively. This averages to 49.48% on the SPEC CPU2000 INT benchmark suite. Similarly, on the UTDSP benchmark suite, in a 0% to 100% range, the hot method prediction accuracy averages for the three thresholds are 84%, 81% and 62% respectively. This averages to 76% on the UTDSP benchmark suite. Overall, this new system obtains 62.57% hot method prediction accuracy.

Figure 7: Total prediction accuracy on the SPEC CPU2000 INT benchmark (bar chart of prediction accuracy % per SPEC CPU2000 INT program at hot method thresholds of 50%, 40% and 30%)

The total method prediction accuracy on the SPEC CPU2000 INT and UTDSP benchmark suites is shown in Fig. 7 and Fig. 9. The total method prediction accuracy for all C programs on the SPEC CPU2000 INT varies from 36% to 100%, with an average of 68.43%, 71.14% and 71.14% for the three hot method thresholds respectively. This averages to 70.24%. The average prediction accuracies obtained on the UTDSP benchmark suite are 69%, 71% and 58% respectively for the 50%, 40% and 30% hot method thresholds. This averages to 66%. Overall, the system predicts both hot and cold methods in a program with 68.15% accuracy.

7 CONCLUSION AND FUTURE WORK

Optimizers depend on profile information to identify the hot methods of program segments. The major inadequacy associated with the dynamic optimization technique is the high cost of accurate data profiling via program instrumentation. In this work, a method is worked out to identify hot methods in a program using the machine learning algorithm, the SVM. According to our study, with a set of ten static features used in training the system, the derived model predicts methods within a program with a total accuracy of 68.15% and hot methods with 62.57% accuracy. However, hot method prediction is of greater value because optimizations will be more effective in these methods.
Future work in this area is aimed at improving the prediction accuracy of the system by identifying more effective static and dynamic features of a program. Further research in this system can be extended to enhance it to a dynamic hot method prediction system which can be used by dynamic optimizers. Applying this approach, the prediction accuracy of other machine learning algorithms can be evaluated to build additional models.

Figure 8: Hot method prediction accuracy on the UTDSP benchmark (bar chart of prediction accuracy % per UTDSP program at hot method thresholds of 50%, 40% and 30%).

Figure 9: Total prediction accuracy on the UTDSP benchmark (bar chart of prediction accuracy % per UTDSP program at hot method thresholds of 50%, 40% and 30%).

8 REFERENCES

[1] Matthew Arnold, Stephen Fink, David Grove, Michael Hind, and Peter F. Sweeney: A Survey of Adaptive Optimization in Virtual Machines, Proceedings of the IEEE, pp. 449-466, February 2005.
[2] A. Monsifrot, F. Bodin, and R. Quiniou: A machine learning approach to automatic production of compiler heuristics, In Proceedings of the International Conference on Artificial Intelligence: Methodology, Systems, Applications, LNCS 2443, pp. 41-50, 2002.
[3] S. Long and M. O'Boyle: Adaptive java optimization using instance-based learning, In ACM International Conference on Supercomputing (ICS'04), pp. 237-246, June 2004.
[4] John Cavazos and Michael F.P. O'Boyle: Automatic Tuning of Inlining Heuristics, 11th International Workshop on Compilers for Parallel Computers (CPC 2006), January 2006.
[5] John Cavazos, J. Eliot B. Moss, and Michael F.P. O'Boyle: Hybrid Optimizations: Which Optimization Algorithm to Use?, 15th International Conference on Compiler Construction (CC 2006), 2006.
[6] C. Lattner and V. Adve: LLVM: A compilation framework for lifelong program analysis & transformation, In Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04), March 2004.
[7] Chih-Chung Chang and Chih-Jen Lin: LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[8] M. Stephenson, S. Amarasinghe, M. Martin, and U. M. O'Reilly: Meta optimization: Improving compiler heuristics with machine learning, In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'03), pp. 77-90, June 2003.
[9] F. Agakov, E. Bonilla, J. Cavazos, G. Fursin, B. Franke, M.F.P. O'Boyle, M. Toussant, J. Thomson, C. Williams: Using machine learning to focus iterative optimization, In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pp. 295-305, 2006.
[10] Christophe Dubach, John Cavazos, Björn Franke, Grigori Fursin, Michael O'Boyle and Oliver Temam: Fast compiler optimization evaluation via code-features based performance predictor, In Proceedings of the ACM International Conference on Computing Frontiers, May 2007.
[11] John Cavazos, Michael O'Boyle: Method-Specific Dynamic Compilation using Logistic Regression, ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), Portland, Oregon, October 22-26, 2006.
[12] John Cavazos: Automatically Constructing Compiler Optimization Heuristics using Supervised Learning, Ph.D thesis, Dept. of Computer Science, University of Massachusetts, 2004.
[13] C. Lee: UTDSP benchmark suite, http://www.eecg.toronto.edu/~corinna/DSP/infrastructure/UTDSP.html, 1998.
[14] G. Fursin, C. Miranda, S. Pop, A. Cohen, and O. Temam: Practical run-time adaptation with procedure cloning to enable continuous collective compilation, In Proceedings of the 5th GCC Developer's Summit, Ottawa, Canada, July 2007.
[15] Vapnik, V.N.: The support vector method of function estimation, In Generalization in Neural Network and Machine Learning, Springer-Verlag, pp. 239-268, 1999.
[16] S. Kotsiantis: Supervised Machine Learning: A Review of Classification Techniques, Informatica Journal 31, pp. 249-268, 2007.
[17] The Standard Performance Evaluation Corporation, http://www.specbench.org.
[18] M. Stephenson and S.P. Amarasinghe: Predicting unroll factors using supervised classification, In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pp. 123-134, 2005.
[19] M. W. Stephenson: Automating the Construction of Compiler Heuristics Using Machine Learning, PhD thesis, MIT, USA, 2006. Available at www.cag.csail.mit.edu/~mstephen/stephenson_phdthesis.pdf.
[20] J. Cavazos and J. Moss: Inducing heuristics to decide whether to schedule, In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2004.
[21] B. Calder, D. Grunwald, Michael Jones, D. Lindsay, J. Martin, M. Mozer, and B. Zorn: Evidence-Based Static Branch Prediction Using Machine Learning, In ACM Transactions on Programming Languages and Systems (ToPLaS-19), Vol. 19, 1997.
[22] Daniel A. Jiménez, Calvin Lin: Neural methods for dynamic branch prediction, ACM Transactions on Computer Systems (TOCS), Vol. 20, No. 4, pp. 369-397, November 2002.
[23] Jeremy Singer, Gavin Brown and Ian Watson: Branch Prediction with Bayesian Networks, In Proceedings of the First Workshop on Statistical and Machine learning approaches applied to Architectures and compilation, pp. 96-112, Jan 2007.
[24] Culpepper B., Gondre M.: SVMs for Improved Branch Prediction, University of California, UC Davis, USA, ECS201A Technical Report, 2005.
[25] Youfeng Wu, Larus J. R.: Static branch frequency and program profile analysis, MICRO-27, Proceedings of the 27th Annual International Symposium on Microarchitecture, pp. 1-11, 1994.
[26] C.-L. Huang and C.-J. Wang: A GA-based feature selection and parameters optimization for support vector machines, Expert Systems with Applications, Vol. 31, Issue 2, pp. 231-240, 2006.

PROPAGATION MODEL FOR HIGHWAY IN MOBILE
COMMUNICATION SYSTEM

K. Ayyappan
Department of Electronics and Communication Engineering,
Rajiv Gandhi College of Engineering and Technology,
Pondicherry, India.
*corresponding author

*P. Dananjayan
Department of Electronics and Communication Engineering,
Pondicherry Engineering College,
Pondicherry - 605014, India.
* [email protected]

ABSTRACT

Radio propagation knowledge is essential for emerging technologies, with appropriate design, deployment and management strategies for any wireless network. It is heavily site specific and can vary significantly depending on terrain, frequency of operation, velocity of the mobile terminal, interference sources and other dynamic factors. Accurate characterization of the radio channel through key parameters and a mathematical model is important for predicting signal coverage, achievable data rates and specific performance attributes of alternative signaling and reception schemes. Path loss models for macro cells, such as the Hata Okumura, COST 231 and ECC 33 models, are analyzed and their parameters compared. The received signal strength was calculated with respect to distance, and the model that can be adopted to minimize the number of handoffs and avoid the ping pong effect is determined. This paper proposes a propagation model for the highway environment between Pondicherry and Villupuram, which are 40 kilometers apart. A comparative study with real time measurements obtained from Bharat Sanchar Nigam Limited (BSNL), a GSM based wireless network for Pondicherry, India, has been carried out.

Keywords: Handoff, Path loss, Received signal strength, ping pong, cellular mobile.

1 INTRODUCTION

Propagation models have traditionally focused on predicting the received signal strength at a given distance from the transmitter, as well as the variability of the signal strength in close spatial proximity to a particular location. Propagation models that predict the signal strength for an arbitrary transmitter-receiver (T-R) separation distance are useful in estimating the radio coverage area of a transmitter. Conversely, propagation models that characterize the rapid fluctuations of the received signal strength over very short travel distances are called small-scale or fading models. Propagation models are useful for predicting signal attenuation or path loss. This path loss information may be used as a controlling factor for system performance or coverage so as to achieve perfect reception [1].
The common approaches to propagation modeling include physical models and empirical models. In this paper, only empirical models are considered. Empirical models use measurement data to model a path loss equation. To conceive these models, a correlation was found between the received signal strength and other parameters such as antenna heights and terrain profiles through the use of extensive measurement and statistical analysis. Radio transmission in a mobile communication system often takes place over irregular terrain. The terrain profile of a particular area needs to be taken into account for estimating the path loss. The terrain profile may vary from a simple curved earth profile to a highly curved mountainous profile. A number of propagation models are available to predict path loss over irregular terrain. While all these models aim to predict the signal strength at a particular receiving point or in a specific location, called a sector, the methods vary widely in their approach, complexity and accuracy. Most of these models are based on a systematic interpretation of measurement data obtained in the service area. In cellular mobile communication systems, handoff takes place due to movement of the mobile unit and unfavorable conditions inside an individual cell or between a number of
adjacent cells [2, 3]. Handoff should be a seamless service to active users while data transfer is in progress, so unnecessary handoffs should be avoided. Hard handoff suffers from the 'ping pong' effect when the mobile users are near the boundaries of adjacent cells, which is a result of frequent handoffs. The parameters measured to determine handoff are usually the received signal strength, the signal to noise ratio and the bit error rate. However, a path loss model can increase the connection reliability. Hence, the choice of path loss model plays an important role in the performance of handoffs. In this paper different path loss models for macro cells, namely the Hata Okumura model, the COST 231 model and the ECC 33 model, are analyzed and their parameters compared. A propagation model for the highway is proposed by modifying the COST 231 and Hata Okumura suburban models; it is applied to the Pondicherry-Villupuram highway and its parameters are compared with experimental values.
The work is organized as follows. Section 2 describes the path loss models. Section 3 deals with the received signal strength for different path loss models. Section 4 discusses the models and the results are evaluated. Section 5 concludes with the performance of various path loss models.

2 PATH LOSS MODELS

Path loss is the reduction in power of an electromagnetic wave as it propagates through space. It is a major component in the analysis and design of the link budget of a communication system. It depends on frequency, antenna height, receive terminal location relative to obstacles and reflectors, and link distance, among many other factors. Macro cells are generally large, providing a coverage range in kilometers, and are used for outdoor communication. Several empirical path loss models have been determined for macro cells. Among numerous propagation models, the following are the most significant ones, providing the foundation of mobile communication services. The empirical models are
i. Hata Okumura model
ii. COST 231 model
iii. ECC 33 model
These prediction models are based on extensive experimental data and statistical analysis, which enable us to compute the received signal level in a given propagation medium [5, 6]. The usage and accuracy of these prediction models depends on the propagation environment.

2.1 Free Space Propagation Model

In radio wave propagation models, the free space model predicts that received power decays as a function of T-R separation distance. This implies that received power decays with distance at a rate of 20 dB/decade. The path loss for the free space model when antenna gains are included is given by

PL(dB) = -Gt - Gr + 32.44 + 20 log(d) + 20 log(f)    (1)

where
Gt is the transmitted antenna gain in dB,
Gr is the received antenna gain in dB,
d is the T-R separation distance in kilometers and
f is the frequency in MHz.

2.2 The Hata Okumura model

The Hata-Okumura model is an empirical formula for graphical path loss data provided by Yoshihisa Okumura, and is valid from 150 MHz to 1500 MHz. The Hata model is a set of equations based on measurements and extrapolations from curves derived by Okumura. Hata presented the urban area propagation loss as a standard formula, along with additional correction factors for application in other situations such as suburban and rural areas. The computation time is short and only four parameters are required in the Hata model [7]. However, the model neglects the terrain profile between transmitter and receiver, i.e. hills or other obstacles between transmitter and receiver are not considered. This is because both Hata and Okumura made the assumption that the transmitter would normally be located on hills. The path loss in dB for the urban environment is given by

PL(dB) = A + B log(d)    (2)

where
d is distance in kilometers and
A represents a fixed loss that depends on the frequency of the signal.
These parameters are given by the empirical formulae

A = 69.55 + 26.16 log(f) - 13.82 log(hb) - a(hm)
B = 44.9 - 6.55 log(hb)

where,
f is frequency measured in MHz,
hb is the height of the base station antenna in meters,
hm is the mobile antenna height in meters and
a(hm) is the correction factor in dB.
For effective mobile antenna height, a(hm) is given by

a(hm) = (1.1 log(f) - 0.7) hm - (1.56 log(f) - 0.8)

The path loss model for the highway is given by
For without noise factor

PL(dB) = PL(dB)urban - 2 [log(f/28)]^2 - 5.4    (3)

For with noise factor

PL(dB) = PL(dB)urban - 2 [log(f/28)]^2    (4)

2.3 COST-231 Hata model

To extend the Hata-Okumura model for personal communication system (PCS) applications operating at 1800 to 2000 MHz, the European Co-operative for Scientific and Technical Research (COST) came up with the COST-231 model. This model is derived from the Hata model and depends upon four parameters for prediction of propagation loss: frequency, height of the received antenna, height of the base station and distance between the base station and the received antenna [8].
The path loss in urban areas is given by

PL(dB) = 46.33 + 33.9 log(f) - 13.82 log(hb) - a(hm) + [44.9 - 6.55 log(hb)] log(d)    (5)

where a(hm) = (1.1 log(f) - 0.7) hm - (1.56 log(f) - 0.8)

The path loss calculation for the highway is similar to the Hata-Okumura model.

2.4 ECC-33 model

The ECC 33 path loss model, which is developed by the Electronic Communication Committee (ECC), is extrapolated from original measurements by Okumura and modifies its assumptions so that it more closely represents a fixed wireless access (FWA) system. The path loss model is defined as

PL(dB) = Afs + Abm - Gt - Gr    (6)

where,
Afs is the free space attenuation,
Abm is the basic median path loss,
Gt is the BS height gain factor and
Gr is the received antenna height gain factor.
They are individually defined as

Afs = 92.4 + 20 log(d) + 20 log(f)
Abm = 20.41 + 9.83 log(d) + 7.894 log(f) + 9.56 [log(f)]^2
Gt = log(hb/200) {13.958 + 5.8 [log(d)]}^2

and, for medium city environments,

Gr = [42.57 + 13.7 log(f)] [log(hm) - 0.585]

where,
f is frequency in GHz,
d is distance between base station and mobile (km),
hb is BS antenna height in meters and
hm is mobile antenna height in meters.

3 RECEIVED SIGNAL STRENGTH

In mobile communication, received signal strength is a measurement of the power present in a received radio signal. The signal strength between base station and mobile must be greater than a threshold value to maintain signal quality at the receiver [9]. Simultaneously, the signal strength must not be too strong, so as not to create more co-channel interference with channels in another cell using the same frequency band. The handoff decision is based on the received signal strength from the current base station and from neighbouring base stations. The signal gets weaker as the mobile moves far away from the base station and gets stronger as it gets closer. The received signal strength for the various path loss models, such as the Hata Okumura model, COST 231 model and ECC-33 model, is calculated as

Pr = Pt + Gt + Gr - PL - A    (7)

where,
Pr is the received signal strength in dBm,
Pt is the transmitted power in dBm,
Gt is the transmitted antenna gain in dB,
Gr is the received antenna gain in dB,
PL is the total path loss in dB and
A is the connector and cable loss in dB.

4 PERFORMANCE ANALYSIS

The performance analysis is based on the calculation of the received signal strength and the path loss between the base station and the mobile from the propagation model. The GSM based cellular network specification obtained from Bharat Sanchar Nigam Limited (BSNL), India, shown in Table 1, is used for evaluating the performance of the path loss models.

Table 1: Simulation parameters

Parameters                          Values
Base station transmitter power     43 dBm
Mobile transmitter power           30 dBm
Base station antenna height        35 m
Mobile antenna height              1.5 m
Transmitter antenna gain           17.5 dB
Threshold level for mobile         -102 dBm
Threshold level for base station   -110 dBm
Frequency                          900 MHz
Connector loss                     2 dB
Cable loss                         1.5 dB
Duplexer loss                      1.5 dB
Maximum uplink loss                161.5 dB
Maximum downlink loss              161.8 dB
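As a rough illustration of equations (2), (3) and (7), the Python sketch below computes the Hata-Okumura path loss and the resulting received signal strength with the Table 1 parameters. The helper names are ours; the receiver antenna gain (taken as 0 dB) and the use of only connector plus cable loss in the A term are our assumptions, so the printed values are only approximately comparable with Tables 2 and 3.

    import math

    log10 = math.log10

    def hata_okumura_urban(f_mhz, hb, hm, d_km):
        # Eq. (2): PL = A + B*log(d), with A and B from the empirical formulae
        a_hm = (1.1 * log10(f_mhz) - 0.7) * hm - (1.56 * log10(f_mhz) - 0.8)
        A = 69.55 + 26.16 * log10(f_mhz) - 13.82 * log10(hb) - a_hm
        B = 44.9 - 6.55 * log10(hb)
        return A + B * log10(d_km)

    def highway_path_loss(f_mhz, hb, hm, d_km, with_noise=False):
        # Eqs. (3)/(4): urban loss corrected by -2[log(f/28)]^2, minus 5.4 for the "without noise" case
        pl = hata_okumura_urban(f_mhz, hb, hm, d_km) - 2 * log10(f_mhz / 28.0) ** 2
        return pl if with_noise else pl - 5.4

    def received_power(pt_dbm, gt_db, gr_db, pl_db, a_db):
        # Eq. (7): Pr = Pt + Gt + Gr - PL - A
        return pt_dbm + gt_db + gr_db - pl_db - a_db

    # Table 1 parameters: 900 MHz, hb = 35 m, hm = 1.5 m, Pt = 43 dBm, Gt = 17.5 dB,
    # connector + cable loss A = 3.5 dB; Gr assumed to be 0 dB here
    for d in (0.5, 1, 2, 4):
        pl = highway_path_loss(900, 35, 1.5, d)
        print(d, round(pl, 1), round(received_power(43, 17.5, 0, pl, 3.5), 1))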


4.1 Path losses for various models

Fig. 1: Pondicherry-Villupuram highway map

Table 2: Path loss for various models

Distance (km)   HATA OKUMURA (dB)   COST 231 (dB)   ECC 33 (dB)
0.5             104.7               105.1           130.7
1               115.2               115.5           139.3
1.5             121.3               121.7           144.7
2               125.6               126.0           148.6
2.5             129.0               129.4           151.8
3               131.7               132.1           154.5
3.5             134.1               134.5           156.8
4               136.1               136.5           158.8

The path losses for the various models are calculated using eq (2), (5) and (6) for the Pondicherry-Villupuram highway shown in Fig. 1, which is connected via Villianur, Kandamangalam, Madagadipet and Valavanur. The circles shown in this figure are the base station transceivers (BTS), each identified by a BTS number. The path loss values are given in Table 2 and the comparison of the various models is given in Fig. 2. The maximum allowable uplink loss for the Nortel S8K base station transceiver is 161.5 dB and the maximum allowable downlink loss is 161.8 dB. The number of handoffs per call is related to the cell size: the smaller the cell size, the larger the number of handoffs, so the path loss model which covers the maximum distance will minimize the number of handoffs. The path loss calculated using the Hata-Okumura and COST 231 models remains below the threshold value up to 19 km, whereas the ECC 33 model exceeds the threshold value at 5 km.

Fig. 2: Comparison of path loss (path loss in dB versus distance in km for the three models)

4.2 Received signal strength for various models

Table 3: BS to MS received power

Distance (km)   COST 231 (dBm)   HATA OKUMURA (dBm)   ECC 33 (dBm)
0.5             -49.2            -49.6                -75.2
1               -59.7            -60.0                -83.8
1.5             -65.9            -66.1                -89.2
2               -70.1            -70.5                -93.2
2.5             -73.5            -73.9                -96.3
3               -76.2            -76.6                -99.0
3.5             -78.6            -79.0                -101.3
4               -80.6            -81.0                -103.3


Fig. 3: Received signal strength for suburban models (received signal strength in dBm versus distance in km)

The received signal strength for the COST 231, Hata-Okumura and ECC 33 models is calculated using eq (7) and shown in Table 3 for the suburban area, and the comparison is shown in Fig. 3. The received signal strength using the ECC 33 model is -103.3 dBm at four kilometers, which falls below the threshold level of the mobile receiver of -102 dBm. The received signal strength values using the COST 231 and Hata-Okumura models remain above the sensitivity threshold of the mobile. So these two models are preferred for maximum coverage area and to reduce the number of handoffs.

4.3 Highway propagation model

Table 4: BS to MS received power

Distance (km)   COST 231 (dBm)   HATA OKUMURA (dBm)   Experiment Value (dBm)
0.5             -49.2            -49.6                -50
1               -59.7            -60.0                -57
1.5             -65.9            -66.1                -65
2               -70.1            -70.5                -68
2.5             -73.5            -73.9                -74
3               -81.6            -82.0                -82
3.5             -84.0            -84.4                -84
4               -86.0            -86.4                -86

Fig. 4: Received signal strength for highway models (received signal strength in dBm versus distance in km)

The general area around the highway can be considered suburban because of its location. The path loss calculation is a major concern on highways, both with and without the noise level. In this paper the suburban model is modified for the highway with a small correction factor with respect to the location. The RSS value for the highway without the noise factor was calculated using the suburban model, while the highway with the noise factor was calculated using an additional correction factor of 5.4 with the suburban model. Here, up to 2.5 km the received signal strength was calculated using the suburban model and beyond 2.5 km it was calculated with the additional noise factor of 5.4.

Fig. 5: Adjacent cell RSSI
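A minimal sketch of our reading of this piecewise highway model (suburban form of eq. (3) up to 2.5 km, an extra 5.4 dB correction beyond that), repeating the Hata urban helper so the snippet stands alone; the default gains and losses are illustrative assumptions.

    import math

    def hata_urban(f, hb, hm, d):
        # Hata-Okumura urban path loss, eq. (2)
        a_hm = (1.1 * math.log10(f) - 0.7) * hm - (1.56 * math.log10(f) - 0.8)
        return (69.55 + 26.16 * math.log10(f) - 13.82 * math.log10(hb) - a_hm
                + (44.9 - 6.55 * math.log10(hb)) * math.log10(d))

    def highway_rss(d, f=900, hb=35, hm=1.5, pt=43, gt=17.5, gr=0, losses=3.5):
        # suburban correction of eq. (3) up to 2.5 km; an extra 5.4 dB "noise factor" beyond that,
        # following the piecewise scheme described in Section 4.3
        pl = hata_urban(f, hb, hm, d) - 2 * math.log10(f / 28.0) ** 2 - 5.4
        if d > 2.5:
            pl += 5.4
        return pt + gt + gr - pl - losses                  # eq. (7)

    for d in (0.5, 1.5, 2.5, 3, 4):
        print(d, round(highway_rss(d), 1))                 # compare with Table 4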


Agilent Technologies provides a Drive Test solution for GSM networks. The Agilent Drive Test E6474A model is used to measure the amount of signal received by a mobile on the Pondicherry-Villupuram highway, as shown in Table 4. The received signal strengths for the COST 231 and Hata-Okumura models are calculated and compared with the experimental values, as shown in Fig. 4. The modified COST 231 and Hata-Okumura suburban models for the highway match the experimental values.
High RSS handover margins can result in poor reception and dropped calls, while very low values of the handover margin can produce ping pong effects as the mobile switches frequently between cells. The received signal strengths of the existing cell broadcast control channel (BCCH-74) and the adjacent cells (BCCH-76, 66, 68, 69 and 78) are shown in Fig. 5. The optimum handover decision is taken from the drive test adjacent cell received signal strength indicator (RSSI) shown in Fig. 5. For BCCH-74 the corresponding base station identification code (BSIC) is 41 and the received signal strength is -72 dBm. Consecutively, the mobile user receives adjacent inter- and intra-cell received signal strengths from BSIC 44, 46 and 41. The optimum handover decision taken from the drive test adjacent cell RSSI improves the signal reception and reduces the number of dropped calls and also the ping-pong effect.

5 CONCLUSION

In this paper, different path loss models for macro cells were used. The calculated path loss is compared with existing models like the Hata-Okumura model, COST 231 model and ECC 33 model. The received signal strength from base stations was calculated and the calculated values are compared with the observed values between Pondicherry and Villupuram highway. The result shows that the modified suburban model for the highway using the Hata-Okumura and COST 231 models is closer to the observed received signal strength and is predicted to be a suitable model for highway received signal strength calculation.

REFERENCES

[1] Armoogum.V, Soyjaudah.K.M.S, Fogarty.T and Mohamudally.N, "Comparative Study of Path Loss using Existing Models for Digital Television Broadcasting for Summer Season in the North of Mauritius", Proceedings of Third Advanced IEEE International Conference on Telecommunication, Mauritius, Vol. 4, pp 34-38, May 2007.
[2] Tomar G.S and Verma S, "Analysis of handoff initiation using different path loss models in mobile communication system", Proceedings of IEEE International Conference on Wireless and Optical Communications Networks, Bangalore, India, Vol. 4, May 2006.
[3] Ken-Ichi Itoh, Soichi Watanabe, Jen-Shew Shih and Takuso Sato, "Performance of Handoff Algorithm Based on Distance and RSS Measurements", IEEE Transactions on Vehicular Technology, Vol. 57, No. 6, pp 1460-1468, November 2002.
[4] Abhayawardhana V.S, Wassell I.J, Crosby D, Sellars M.P. and Brown M.G, "Comparison of empirical propagation path loss models for fixed wireless access systems", Proceedings of IEEE Conference on Vehicular Technology, Stockholm, Sweden, Vol. 1, pp 73-77, June 2005.
[5] Maitham Al-Safwani and Asrar U.H. Sheikh, "Signal Strength Measurement at VHF in the Eastern Region of Saudi Arabia", The Arabian Journal for Science and Engineering, Vol. 28, No. 2C, pp. 3-18, December 2003.
[6] S. Hemani and M. Oussalah, "Mobile Location System Using Netmonitor and MapPoint server", Proceedings of Sixth Annual Postgraduate Symposium on the Convergence of Telecommunication, Networking and Broadcasting, PGNet, pp. 17-22, 2006.
[7] T.S. Rappaport, "Wireless Communications", Pearson Education, 2003.
[8] William C.Y. Lee, "Mobile Cellular Telecommunications", McGraw Hill International Editions, 1995.
[9] Ahmed H. Zahram, Ben Liang and Aladdin Dalch, "Signal threshold adaptation for vertical handoff in heterogeneous wireless networks", Mobile Networks and Applications, Vol. 11, No. 4, pp 625-640, August 2006.


A Hardware implementation of Winograd Fourier Transform
algorithm for Cryptography

1G.A. Sathishkumar and 2Dr. K. Boopathy Bagan
1Assistant Professor, Department of Electronics and Communication Engineering, Sri Venkateswara College of Engineering, Sriperumbudur - 602108.
2Professor, Department of Electronics, Madras Institute of Technology, Anna University Chrompet Campus, Chennai - 600044, Tamil Nadu, INDIA
[email protected], [email protected]

ABSTRACT
This paper presents a hardware implementation of efficient algorithms that use a mathematical framework based on the Winograd Fourier Transform Algorithm, obtaining a set of formulations that simplify cyclic convolution (CC) computations and the CRT. In particular, this work focuses on the arithmetic complexity of multiplication: wherever there is a multiplication, the product represents a CC computational operation. The proposed algorithms are compared against existing algorithms developed using the FFT, and it is shown that the proposed algorithms exhibit an advantage in computational efficiency. This design is most useful when dealing with large integers, as required by many modern cryptographic systems.
The Winograd Fourier Transform Algorithm (WFTA) is a technique that combines Rader's index mapping and Winograd's short convolution modules for prime factors into a composite-N Fourier Transform structure with fewer multipliers (O(N)). While theoretically interesting, WFTAs are complicated and different for every length. The WFTA can be implemented on modern processors with few hardware multipliers and hence is very useful in practice today.

Keywords: Discrete Fourier Transform, Fast Fourier Transform, Winograd's Theorem, Chinese Remainder Theorem.

INTRODUCTION

Many popular crypto-systems like the RSA encryption scheme [12], the Diffie-Hellman (DH) key agreement scheme [13], or the Digital Signature Algorithm (DSA) [14] are based on long integer modular exponentiation. A major difference between the RSA scheme and cryptosystems based on the discrete logarithm problem is the fact that the modulus used in the RSA encryption scheme is the product of two prime numbers. This allows utilizing the Chinese Remainder Theorem (CRT) in order to speed up the private key operations. From a mathematical point of view, the usage of the CRT for RSA decryption is well known. However, for a hardware implementation, a special multiplier architecture is necessary to meet the requirements for efficient CRT-based decryption. This paper presents the basic algorithmic and architectural concepts of the WFTA crypto chip, and describes how they were combined to provide optimum performance. The major design goal with the WFTA was the maximization of performance on several levels, including the implemented hardware algorithms.
In digital signal processing, the design of fast and computationally efficient algorithms has been a major focus of research activity. The objective, in most cases, is the design of algorithms and their respective implementation in a manner that performs the required computations in the least amount of time. In order to achieve this goal, parallel processing has also received a lot of attention in the research community [1]. This document is organized as follows. First, the mathematical foundations needed for the study of algorithms to compute the DFT and the FFT algorithm are summarized. Second, the relation between the Winograd Fourier Transform and Rader's algorithm is established. Third, the algorithm development for the basic problem of multiplication, using the conceptual framework developed in the two previous sections, is explained. The


section also presents several signal flow diagrams that may be implemented in diverse architectures by means of very large scale integration (VLSI) or the very high-speed integrated circuits hardware description language (VHDL). Conclusions, contributions, and future development of the present work are then summarized.

1. DISCRETE FOURIER TRANSFORM (DFT)

1.1 DEFINITION

The discrete Fourier transform (DFT) is a powerful reversible mapping transform for discrete data sequences with mathematical properties analogous to those of the Fourier transform, and it transforms a function from the time domain to the frequency domain. For a length-n input vector x, the DFT is a length-n vector F whose elements are

F_j = sum_{k=0..n-1} x_k exp(-2*pi*i*j*k/n),  j = 0, 1, ..., n-1    (1)

A simple description of these equations is that the complex numbers Fj represent the amplitude and phase of the different sinusoidal components of the input "signal" xk. The DFT computes the Fj from the xk, while the IDFT shows how to compute the xk as a sum of sinusoidal components Fj exp(2*pi*i*j*k/n)/n with frequency (j/n) cycles per sample. By writing the equations in this form, we are making extensive use of Euler's formula to express sinusoids in terms of complex exponentials, which are much easier to manipulate. The number of multiplication and addition operations required by the Discrete Fourier Transform (DFT) is of order N², as there are N data points to calculate, each of which requires N arithmetic operations. To be exact, the input, if complex, would contain 2 terms and every exponential term would contain 2 terms. So, this would quadruple the computational complexity; thus the number of multiplications is 4N². Hence, for real inputs the required number of multiplications will be 2N².
A fast Fourier transform (FFT) [2] is an efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse. FFTs are of great importance to a wide variety of applications, from digital signal processing and solving partial differential equations to algorithms for quick multiplication of large integers. By far the most common FFT is the Cooley-Tukey algorithm. This is a divide and conquer algorithm that recursively breaks down a DFT of any composite size N = N1N2 into many smaller DFTs of sizes N1 and N2, along with O(N) multiplications by complex roots of unity traditionally called twiddle factors. The most well-known use of the Cooley-Tukey algorithm is to divide the transform into two pieces of size N/2 at each step, and it is therefore limited to power-of-two sizes, but any factorization can be used in general (as was known to both Gauss and Cooley/Tukey). These are called the radix-2 and mixed-radix cases, respectively (and other variants such as the split-radix FFT have their own names as well). Although the basic idea is recursive, most traditional implementations rearrange the algorithm to avoid explicit recursion. In addition, because the Cooley-Tukey algorithm breaks the DFT into smaller DFTs, it can be combined arbitrarily with any other algorithm for the DFT.

1.2 Multiplication of large integers

The fastest known algorithms [1, 8, 10] for the multiplication of very large integers use the polynomial multiplication method. Integers can be treated as the value of a polynomial evaluated specifically at the number base, with the coefficients of the polynomial corresponding to the digits in that base. After polynomial multiplication, a relatively low-complexity carry-propagation step completes the multiplication.
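A small illustration of this polynomial view of integer multiplication (a naive Python sketch with quadratic coefficient multiplication; real implementations would evaluate the coefficient convolution with an FFT or WFTA):

    def digits(n, base=10):
        # least-significant digit first, i.e. coefficients of the polynomial evaluated at `base`
        out = []
        while n:
            n, r = divmod(n, base)
            out.append(r)
        return out or [0]

    def multiply(a, b, base=10):
        da, db = digits(a, base), digits(b, base)
        # polynomial (coefficient) multiplication: this convolution is the expensive step
        coeff = [0] * (len(da) + len(db) - 1)
        for i, x in enumerate(da):
            for j, y in enumerate(db):
                coeff[i + j] += x * y
        # low-complexity carry propagation turns the coefficients back into an integer
        result, k, carry = 0, 0, 0
        for c in coeff:
            carry, digit = divmod(c + carry, base)
            result += digit * base ** k
            k += 1
        return result + carry * base ** k

    print(multiply(123456789, 987654321) == 123456789 * 987654321)   # expected: True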


2. WINOGRAD FOURIER TRANSFORM ALGORITHM (WFTA)

The Winograd Fourier Transform Algorithm (WFTA) [7] is a technique that combines Rader's index mapping and Winograd's short convolution algorithm for prime factors into a composite-N Fourier Transform structure with fewer multipliers. The Winograd algorithm factorizes z^N - 1 into cyclotomic polynomials; these often have coefficients of 1, 0, or -1, and therefore require few (if any) multiplications, so Winograd can be used to obtain minimal-multiplication FFTs and is often used to find efficient algorithms for small factors. Indeed, Winograd showed that the DFT can be computed with only O(N) irrational multiplications, leading to a proven achievable lower bound on the number of multiplications for power-of-two sizes; unfortunately, this comes at the cost of many more additions, a tradeoff no longer favorable on modern processors with hardware multipliers. In particular, Winograd also makes use of the PFA as well as an algorithm by Rader for FFTs of prime sizes.

2.1 RADER'S ALGORITHM

Rader's algorithm (1968) [7, 10] is a fast Fourier transform (FFT) algorithm that computes the discrete Fourier transform (DFT) of prime sizes by re-expressing the DFT as a cyclic convolution.
Since Rader's algorithm only depends upon the periodicity of the DFT kernel, it is directly applicable to any other transform (of prime order) with a similar property, such as a number-theoretic transform or the discrete Hartley transform.
The algorithm can be modified to gain a factor of two savings for the case of DFTs of real data, using a slightly modified re-indexing/permutation to obtain two half-size cyclic convolutions of real data; an alternative adaptation for DFTs of real data, using the discrete Hartley transform, was described by Johnson.
Winograd extended Rader's algorithm to include prime-power DFT sizes p^m, and today Rader's algorithm is sometimes described as a special case of Winograd's FFT algorithm, also called the multiplicative Fourier transform algorithm, which applies to an even larger class of sizes.

2.2 Algorithm

The Rader algorithm computes the DFT

X(k) = sum_{n=0..N-1} x(n) W_N^(nk),   k, n in Z_N;  ord(W_N) = N    (2)

which is defined for prime length N. We first compute the DC component with

X(0) = sum_{n=0..N-1} x(n)    (3)

Because N = p is a prime, it is known that there is a primitive element, a generator 'g', that generates all the nonzero values of n and k in the field Z_N. Substituting n with g^k mod N gives the following index transform for k = 0, ..., N-2, whose right side is a cyclic convolution, i.e.,

X[g^k mod N] - x(0) = sum_{n=0..N-2} x[g^n mod N] W_N^(g^((n+k) mod (N-1)))    (4)

which is the cyclic convolution of the two sequences

[x[g^0 mod N], x[g^1 mod N], ..., x[g^(N-2) mod N]]  (*)  [W_N, W_N^g, ..., W_N^(g^(N-2))]    (5)

Thus, an N-point DFT is converted into an (N-1)-point cyclic convolution.
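A minimal sketch of equations (2)-(4) in Python (a direct, unoptimized implementation; the length-(N-1) convolution is evaluated term by term here, whereas in practice it would be computed with a short Winograd module or an FFT):

    import numpy as np

    def rader_dft(x):
        # N assumed prime; X[g^k] - x[0] is a length-(N-1) cyclic convolution
        # of the permuted input with the permuted DFT kernel (eqs. 2-4)
        x = np.asarray(x, dtype=complex)
        N = len(x)
        # find a generator g of the multiplicative group Z_N*
        g = next(c for c in range(2, N)
                 if len({pow(c, i, N) for i in range(N - 1)}) == N - 1)
        W = np.exp(-2j * np.pi / N)                     # W_N
        X = np.empty(N, dtype=complex)
        X[0] = x.sum()                                   # DC component, eq. (3)
        for k in range(N - 1):
            acc = sum(x[pow(g, n, N)] * W ** pow(g, (n + k) % (N - 1), N)
                      for n in range(N - 1))
            X[pow(g, k, N)] = x[0] + acc                 # eq. (4) rearranged
        return X

    # quick check against a directly evaluated DFT for N = 5
    x = np.arange(5, dtype=float)
    direct = np.array([sum(x[n] * np.exp(-2j * np.pi * n * m / 5) for n in range(5))
                       for m in range(5)])
    print(np.allclose(rader_dft(x), direct))             # expected: True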


3. WINOGRAD'S SMALL CONVOLUTION ALGORITHM

This algorithm performs the convolution with the minimum number of multiplications and additions, and thus the computational complexity of the process is greatly reduced. Cyclic convolution is also known as circular convolution. Let h = {h0, h1, ..., hn-1} be the filter coefficients and x = {x0, x1, ..., xn-1} be the data sequence. The cyclic convolution can be expressed as

s(p) = h (*)n x = h(p) x(p) mod (p^n - 1)    (6)

The cyclic convolution can be computed as a linear convolution reduced modulo p^n - 1. Alternatively, the cyclic convolution can be computed using the CRT with m(p) = p^n - 1, which is much simpler. Thus, Winograd's minimum-multiply DFTs are useful only for small N. They are very important for Prime-Factor Algorithms, which generally use Winograd modules to implement the short-length DFTs [10]. The theory and derivation of these algorithms is quite elegant but requires substantial background in number theory and abstract algebra. Fortunately, for the practitioner, the entire set of short algorithms one is likely to need has already been derived and can simply be looked up without mastering the details of their derivation.

3.1 Algorithm

1. Choose a polynomial m(p) with degree higher than the degree of h(p)x(p) and factor it into k+1 relatively prime polynomials with real coefficients, i.e.,

   m(p) = m^(0)(p) m^(1)(p) ... m^(k)(p)    (7)

2. Let M^(i)(p) = m(p) / m^(i)(p) and use the Chinese Remainder Theorem (CRT) algorithm to get N^(i)(p).

3. Compute

   h^(i)(p) = h(p) mod m^(i)(p)    (8)
   x^(i)(p) = x(p) mod m^(i)(p)    (9)

   for i = 0, 1, 2, ..., k.

4. Compute

   s^(i)(p) = h^(i)(p) x^(i)(p) mod m^(i)(p)    (10)

   for i = 0, 1, ..., k.

5. Compute s(p) using the equation

   s(p) = sum_{i=0..k} s^(i)(p) N^(i)(p) M^(i)(p) mod m(p)    (11)

The computational complexity in the case of the WFTA is of the order of N, O(N). It has been found that the number of multipliers required by the WFTA is always less than 2N, which drastically reduces the hardware needed for implementing a DFT block.
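For reference, a tiny Python sketch of the cyclic convolution of eq. (6), computed as the linear (polynomial) convolution folded back modulo n; this is the operation that the CRT-based steps (7)-(11) decompose into cheaper pieces:

    import numpy as np

    def cyclic_convolution(h, x):
        # s(p) = h(p) x(p) mod (p^n - 1): indices wrap around modulo n
        n = len(h)
        s = np.zeros(n)
        for i in range(n):
            for j in range(n):
                s[(i + j) % n] += h[i] * x[j]
        return s

    h = np.array([1.0, 2.0, 3.0])
    x = np.array([4.0, 5.0, 6.0])
    print(cyclic_convolution(h, x))        # [31. 31. 28.]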


4. REALIZATION OF WFTA IN VERILOG HDL

The behavioral simulation and synthesis of the WFTA for N=2 and 5 can be viewed in the following descriptions. We focus on the Verilog HDL [6] used for our simulation and synthesis. The results are shown in Figures 1, 2, 3, 4, 5 and 6.

Figure 1: Simulation result for N=2
Figure 2: Simulation result for N=5

4.1.1 SYNTHESIS RESULTS

Synthesis of the WFTA is implemented using the XILINX ISE 9.1i tool. After design entry and optional simulation, we run synthesis. During this step, VHDL, Verilog, or mixed-language designs become netlist files that are accepted as input to the implementation step.

4.1.2 IMPLEMENTATION

After synthesis, we run design implementation, which converts the logical design into a physical file format that can be downloaded to the selected target device. From Project Navigator, we can run the implementation process in one step, or we can run each of the implementation processes separately.

Figure 3: Schematic of DFT N=2
Figure 4: Schematic of WDFT N=2
Figure 5: Schematic of DFT N=5
Figure 6: Schematic of WDFT N=5


Table 1: Comparison of DFT and WFTA

Values of N   Multipliers required in DFT   Multipliers required in WFTA
2             8                             0
3             8                             2
5             50                            5

4.1.3 COMPARISON OF THE OUTPUT VALUES OF DFT AND WDFT FOR N=2, 3 AND 5

The output coefficients of the normal DFT and the WDFT are obtained by using MATLAB software, as A0, A1, A3, A4 and A5. This is shown in Fig. 7.

Figure 7: Comparison of normal DFT and WDFT

4.2 CONCLUSION AND FUTURE ENHANCEMENTS

Most present day cryptography algorithms require complex multiplications, and so more multipliers are required. With the use of Winograd's Fourier transform algorithm only the least number of multipliers is required. The future scope of the work is to reduce the multiplier count further, so that the algorithm occupies less space and consumes less power.
Thus Winograd's Fourier transform algorithm for N=2, 3 and 5 has been realized, and the WFTA has been proved accurate by comparing the values obtained using the WFTA with those of the DFT using MATLAB. Behavioral simulation of the WFTA was done using the XILINX ISE Simulator and the synthesis was carried out in the XILINX ISE tool. A full RTL schematic is obtained with logic gate level modeling. Finally, the circuit was implemented using an FPGA Spartan 3 kit.
The WFTA for transform lengths equal to powers of prime numbers, i.e. N = 2^m, 3^m and 5^m, can be obtained by iteratively using the same skeleton that we have used. Even though this project drastically reduces the number of multipliers required to compute the DFT, it does not eliminate their use. Hence, a new algorithm can be devised based on the theories of the WFTA to implement the DFT with no multipliers and reduced adders.

5. REFERENCES

[1] H. Krishna, B. Krishna, K. Y. Lin, J. D. Sun: Computational Number Theory and Digital Signal Processing, CRC, Boca Raton, Florida, pp. 471-485, (1994).
[2] C. S. Burrus and P. W. Eschenbacher: "An in-place in-order prime factor FFT algorithm", IEEE Trans. Acoust. Speech Signal Processing, vol. ASSP-29, no. 4, pp. 806-817, (1981).
[3] A. M. Despain: "Very fast Fourier transform algorithms hardware for implementation," IEEE Trans. Computers, vol. C-28, no. 5, pp. 333-341, May (1979).
[4] Keshab Parhi: "VLSI Digital Signal Processing Systems: Design and Implementation", Wiley, pp. 237-244, (1999).
[5] M. D. Macleod and N. L. Bragg: "A fast hardware implementation of the Winograd Fourier transform algorithm," Electron. Lett., vol. 19, pp. 363-365, May (1983).
[6] Uwe Meyer-Baese: "Digital signal processing with Field Programmable Gate Arrays", Springer-Verlag, pp. 273-276, (2007).


[7] S. Winograd: "On computing the discrete Fourier transform," Math. Comp., vol. 32, no. 141, pp. 175-199, Jan. (1978).
[8] J. McClellan and C. Rader: Number Theory in Digital Signal Processing, Englewood Cliffs, NJ: Prentice Hall, pp. 79-85, (1979).
[9] S. Winograd: Arithmetic Complexity of Computations, Society for Industrial and Applied Mathematics, (1980).
[10] M. Heideman: Multiplicative Complexity, Convolution, and the DFT, Springer-Verlag, New York, (1988).
[11] J. Cooley: Some Applications of Computational Complexity Theory to Digital Signal Processing, 1981 Joint Automatic Control Conference, University of Virginia, June 17-19, (1981).
[12] R. L. Rivest, A. Shamir, and L. Adleman: A Method for Obtaining Digital Signatures and Public Key Cryptosystems, Communications of the Association for Computing Machinery, 21(2), pp. 120-126, February (1978).
[13] W. Diffie and M. E. Hellman: New Directions in Cryptography, IEEE Transactions on Information Theory, IT-22(6), pp. 644-654, November (1976).
[14] National Institute of Standards and Technology (NIST): FIPS Publication 186: Digital Signature Standard, National Institute for Standards and Technology, Gaithersburg, MD, USA, May (1994).

Sathishkumar G.A obtained his M.E from PSG College of Technology, Coimbatore, India. He is currently pursuing a PhD from Anna University, Chennai, and is a faculty member in the Electronics and Communication Department of Sri Venkateswara College of Engineering, Sriperumbudur. His research interests are VLSI signal processing algorithms, image processing and network security.

Dr. K. Boopathy Bagan completed his doctoral degree at IIT Madras. He is presently working as Professor, ECE Department, in Anna University, MIT Chrompet campus, Chennai. His areas of interest include VLSI, image processing, signal processing and network security.


A MULTIAGENT CONCEPTUALIZATION FOR SUPPLY-CHAIN
MANAGEMENT

Vivek kumar , Amit Kumar Goel , Prof. S.Srinivisan


Department of computer science & Engineering Gurgaon Institute Technology Management, India
[email protected], [email protected]

ABSTRACT

In the global world there is a huge network formed by different companies together with their suppliers, warehouses, distribution centers and retailers; with the help of these entities any organization acquires raw material, transforms it, and delivers finished goods. The major challenges for an industrial organization are to reduce product development time, improve quality, and reduce the cost of production. This is possible only when the relationships among the various organizations/industrial houses are good, and it cannot be achieved merely by changing the industrial process or method: the latest electronic tools controlled by computer software must also be established in this competitive world. Software agents take on one or many responsibilities of the supply chain, and each agent can interact with the others without human intervention in the planning and execution of its responsibilities. This paper presents a solution for the construction, architecture, coordination and design of agents. It integrates bilateral negotiation, an order monitoring system and production planning and scheduling in a multiagent system.

Keywords: Agent, Supply chain, Multiagent, Multiagent System Architecture for supply chain management

1. INTRODUCTION

To improve the efficiency of supply chain management it is mandatory to take intelligent tactical, strategic and good operational decisions at each end of the chain. Under strategic decisions the agent will take the decisions about suppliers, warehouses, production units, transportation systems etc. The tactical decisions take place to meet actual demand. The agent on the operational level is responsible for executing the whole plan. To do all things in a smooth way, coordination among agents is a must; otherwise, if the material does not arrive on time the production will stop, and if the finished goods are ready but the warehouses are not empty then it will create great confusion. The ability to manage all levels of the supply chain system [1], coordination, and accurate and timely dissemination of information is the enterprise goal.

2 Agent

In software we can define an agent as an entity which consists of various attributes that define a particular domain. For example, an agent dealing with warehousing consists of its local attributes as well as the details which will be coordinated with other entities (agents). So agents emulate the mental process or simulate rational behavior. A multi-agent system is a loosely coupled network of problem-solver entities that work together to find answers to problems that are beyond the individual capabilities or knowledge of each entity. The first issue is how the different activities of the supply chain can be distributed among agents. A typical example of a multiagent system is that of a coffee maker and a toast maker. Suppose a person wants the toast ready as soon as the coffee is ready; this means the coordination between the coffee maker and the toast maker is essential. Otherwise many situations may arise, such as the coffee being ready while the toast is not yet prepared and comes some time later, or the toast being ready while the coffee is not prepared.
1. Agents are problem solvers.
2. Agents are pro-active.
3. Agents are goal-oriented.
4. Agents are context-aware.
5. Agents are autonomous.

2.1 Requirement / Logistics agent

These agents coordinate all activities of the plant and find the various demands of the various sections. This agent holds data of day-to-day production and finds how much material has been consumed in a day, depending on the working hours a machine works. It categorizes each component in a different table and coordinates with other agents like the Demand agent. The intelligent part of the agent is to find the efficiency of a machine, minimizing cost and increasing throughput. It can also gather feedback on the finished goods and suggest appropriate changes if required.
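As a loose illustration of how such cooperating agents could be wired together, the Python sketch below shows a logistics agent answering a stock query from a demand agent. The agent names, message fields and the reorder rule are ours, not a specification from the paper.

    from dataclasses import dataclass

    @dataclass
    class Message:
        sender: str
        receiver: str
        performative: str          # e.g. "ask", "tell"
        content: dict

    class Agent:
        def __init__(self, name, bus):
            self.name, self.bus = name, bus
            bus[name] = self
        def send(self, receiver, performative, content):
            self.bus[receiver].receive(Message(self.name, receiver, performative, content))
        def receive(self, msg):
            raise NotImplementedError

    class LogisticsAgent(Agent):
        def __init__(self, name, bus, stock):
            super().__init__(name, bus)
            self.stock = stock                      # per-item quantities from day-to-day production data
        def receive(self, msg):
            if msg.performative == "ask":
                item = msg.content["item"]
                self.send(msg.sender, "tell", {"item": item, "qty": self.stock.get(item, 0)})

    class DemandAgent(Agent):
        def receive(self, msg):
            # decide whether to place a vendor order when stock runs low
            if msg.performative == "tell" and msg.content["qty"] < 100:
                print(f"{self.name}: reorder {msg.content['item']}")

    bus = {}
    LogisticsAgent("logistics", bus, {"steel": 40})
    demand = DemandAgent("demand", bus)
    demand.send("logistics", "ask", {"item": "steel"})     # prints: demand: reorder steel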


2.2 Demand Agent

This agent coordinates with other agents such as the requirement/logistics agent. Its main objective is to fulfil the requirements of the various sections of the company and of its customers. The intelligent part of this agent is to acquire orders from various vendors and compare them on the basis of quality, price, availability, etc. In case any demand increases or decreases, the vendor is informed automatically.

2.3 Transport agent

This agent is responsible for the availability of transport and for dispatching finished goods to a particular destination. It manages all the transportation routes.

2.4 Financial agent

This agent is responsible for making money available for purchasing any material. It coordinates with other agents, analyzes the cost and ensures that the money has been paid to the party within a definite time.

2.5 Scheduling agent

This agent is responsible for scheduling and rescheduling activities in the factory, exploring hypothetical "what-if" scenarios for potential new orders and generating schedules that are sent to the dispatching agent for execution. It assigns resources and start times to new orders and activities in a way that is feasible while at the same time optimizing certain criteria such as minimizing work in progress or tardiness. It can generate a schedule from scratch or repair an existing schedule that has violated some constraints. In anticipation of domain uncertainties like machine breakdowns or material unavailability, the agent may reduce the precision of a schedule by increasing the degrees of freedom in the schedule for the dispatcher to work with. For example, it may "temporally pad" a schedule by increasing an activity's duration or "resource pad" an operation by either providing a choice of more than one resource or increasing the capacity required so that more is available.

3 MIDDLE AGENTS

3.1 Facilitators

Agents to which other agents surrender their autonomy in exchange for the facilitator's services. Facilitators can coordinate agents' activities and can satisfy requests on behalf of their subordinated agents.
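As an illustration of the "temporal padding" and "resource padding" ideas described for the Scheduling agent in Section 2.5, the sketch below inflates an activity's duration and offers an alternative resource. The Activity type, the 20% factor and the resource names are hypothetical and only indicate the general idea.

    from dataclasses import dataclass, replace

    @dataclass
    class Activity:
        name: str
        start: float          # planned start time (hours)
        duration: float       # nominal duration (hours)
        resources: tuple      # candidate resources; offering more than one is "resource padding"

    def temporally_pad(activity: Activity, factor: float = 0.2) -> Activity:
        """Increase the activity's duration by `factor`, giving the dispatcher slack
        against machine breakdowns or late material."""
        return replace(activity, duration=activity.duration * (1.0 + factor))

    milling = Activity("milling", start=8.0, duration=2.0, resources=("CNC-1", "CNC-2"))
    padded = temporally_pad(milling)
    print(padded.duration)    # 2.4 hours instead of 2.0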

Fig 2: Architecture of Multiagent



3.2 Mediators

Agents that exploit encoded knowledge to create services for a higher level of applications.

3.3 Brokers

Agents that receive requests and perform actions using services from other agents in conjunction with their own resources.

3.4 Helpline/Yellow pages

Agents that assist service requesters in finding service provider agents based on advertised capabilities.

3.5 Agent Interaction

Interaction is one of the important features of an agent [2]. In other words, agents interact recurrently to share information and to perform tasks in order to achieve their goals. Researchers investigating agent communication languages mention three key elements needed to achieve multiagent interaction [3][4][5]:

• A common agent communication language and protocol
• A common format for the content of communication
• A shared ontology

4. AGENT COMMUNICATION LANGUAGE

There are two main approaches to designing an agent communication language [6]. The first approach is procedural and the second is declarative. Procedural communication is based on executable content, whereas declarative communication is based on definitions, assumptions and declarative statements. One of the more popular declarative agent communication languages is KQML [8].

5. MULTIAGENT SYSTEM ARCHITECTURE FOR SUPPLY CHAIN MANAGEMENT

Our framework provides a GUI application that enables the design of a multiagent system with Protégé-2000 [7], as well as of single agents or multiagent communities, using common drag and drop operations.
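To make the declarative style of Section 4 concrete, the sketch below renders a KQML-like performative as plain text. The parameter names (:sender, :receiver, :reply-with, :ontology, :content) follow common KQML usage, but the agents, ontology and query shown are invented for illustration and are not part of this framework.

    def kqml_message(performative: str, fields: dict) -> str:
        """Render a KQML-style message as an s-expression string."""
        body = " ".join(":{} {}".format(key, value) for key, value in fields.items())
        return "({} {})".format(performative, body)

    # A demand agent asking a logistics agent for the stock level of a component.
    msg = kqml_message("ask-one", {
        "sender": "demand-agent",
        "receiver": "logistics-agent",
        "reply-with": "q1",
        "language": "Prolog",
        "ontology": "factory-stock",
        "content": '"stock_level(component_17, Qty)"',
    })
    print(msg)
    # (ask-one :sender demand-agent :receiver logistics-agent :reply-with q1 ...)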

Fig 1: Architecture of Multiagent Supply Chain Management System



6. FORMULATION OF BEHAVIOR TYPES

Behavior depends on the generic templates and the work flow, i.e. receiving and sending messages. It executes the stored application and derives the necessary decision using an inference engine. There are four types of workflow terminals.

6.1 Add-on terminals

For the addition of predefined functions.

6.2 Execute terminals

Execute the particular reasoning terminal.

6.3 Agent Types

After the formulation of a behavior type we get a new agent type to be used later in multiagent system development, i.e.

Agent Type = Agent + Behavior

6.4 Receiving terminals

For the filtration of received information.

6.5 Sending terminals

For the composition of information, which is then sent further.

A new agent can be created from an existing one, which serves as a template for creating agent instances during the design of a multiagent system architecture.
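A minimal sketch of the "Agent Type = Agent + Behavior" formulation is given below. The class names and terminal hooks are assumptions made for illustration and do not correspond to the actual Protégé-2000 or ATS interfaces; they only mirror the workflow terminals listed above.

    class Behavior:
        """Bundles the workflow terminals described in Sections 6.1-6.5."""
        def __init__(self, add_ons=None, execute=None, receive_filter=None, compose=None):
            self.add_ons = add_ons or []          # 6.1 add-on terminals: predefined functions
            self.execute = execute                # 6.2 execute terminal: reasoning step
            self.receive_filter = receive_filter  # 6.4 receiving terminal: filter incoming data
            self.compose = compose                # 6.5 sending terminal: compose the outgoing message

    def make_agent_type(type_name: str, behavior: Behavior):
        """Agent Type = Agent + Behavior: returns a template class from which
        agent instances can be created while designing the multiagent system."""
        class AgentType:
            name = type_name
            def handle(self, message):
                data = behavior.receive_filter(message) if behavior.receive_filter else message
                decision = behavior.execute(data) if behavior.execute else data
                return behavior.compose(decision) if behavior.compose else decision
        return AgentType

    # Example: a demand-agent type whose 'reasoning' simply picks the cheapest offer.
    demand_behavior = Behavior(
        execute=lambda offers: min(offers, key=lambda o: o["price"]),
        compose=lambda best: "order from " + best["vendor"],
    )
    DemandAgent = make_agent_type("demand-agent", demand_behavior)
    print(DemandAgent().handle([{"vendor": "A", "price": 9}, {"vendor": "B", "price": 7}]))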

6.6 Database for agents

This unit acts as a storage facility to ensure inter-functionality between all system components. In this system the database stores ontologies, behaviors, agent types and the historical data to be mined. This unit can be designed using RMI.

6.7 Agent training system (ATS)

This system gathers information from the data mining procedure, takes the decision and sends this decision into the newly created agent.

6.8 Data Mining System

This system holds the implementation of the data mining algorithms executed by the data mining procedures, which give a new decision model that is again enabled in the agent via the ATS. It is also responsible for embedding specific knowledge into agents. This data mining module receives the information from the XML document and executes the suitable data mining functions designed by the application developer. These models are represented in the Predictive Model Markup Language [8], a data mining standard defined by the DMG (Data Mining Group) [9], which provides the agent platform with versatility and compatibility with other tools. Major data mining software packages are Oracle, SAS, SPSS, MineIT, etc.

7. CONCLUSIONS

Information technology based solution frameworks offer a way to more effectively integrate decision-making by enabling better knowledge sharing and facilitating more transparent economic transactions. The multi-agent system paradigm promises to be a valuable software engineering abstraction for the development of computer systems. In addition, the wide adoption of the Internet as an open environment and the increasing popularity of machine-independent programming languages, such as Java, make the widespread adoption of multi-agent technology a feasible goal.

REFERENCES

1. Zhou, L.; Xu, X.; Huang, T. and Deng, S.: Enterprise Interoperability: New Challenges and Approaches.

2. Nwana, H. S.: Software Agents: An Overview, The Knowledge Engineering Review, October/November 1996, Vol. 11, No. 3, PP. 205-244.

3. Data Mining Group, the: Predictive Model Markup Language Specifications (PMML), ver. 2.0, available at: http://www.dmg.org

4. Bradshaw, J. M.; Dutfield, S.; Benoit, P. and Woolley, J.D.: KAoS: Toward An Industrial-Strength Open Agent Architecture, Software Agents, Bradshaw, J.M. (Ed.), Menlo Park, Calif., AAAI Press, 1997, PP. 375-418.

5. Russell, S. J. and Norvig, P.: Artificial Intelligence: A Modern Approach, Prentice Hall, Englewood Cliffs, N.J., 1995.



6. Genesereth, M.: An Agent-based Framework for Interoperability, Software Agents, Bradshaw, J. M. (Ed.), Menlo Park, Calif., AAAI Press, 1997, PP. 317-345.

7. Noy, N. F.; Sintek, M.; Decker, S.; Crubezy, M.; Fergerson, R. W. & Musen, M. A.: Creating Semantic Web Contents with Protégé-2000, IEEE Intelligent Systems, Vol. 16, No. 2, 2001, PP. 60-71.

8. Finin, T.; Labrou, Y. and Mayfield, J.: KQML as an Agent Communication Language, Software Agents, Bradshaw, J.M. (Ed.), Menlo Park, Calif., AAAI Press, 1997, PP. 291-316.

9. Huhns, M. N. and Singh, M. P.: Agents and Multi-agent Systems: Themes, Approaches, and Challenges, Readings in Agents, Huhns, M. N. and Singh, M. P. (Eds.), San Francisco, Calif., Morgan Kaufmann Publishers, 1998, PP. 1-23.



SCALABLE ENERGY EFFICIENT AD-HOC ON DEMAND DISTANCE
VECTOR (SEE-AODV) ROUTING PROTOCOL IN WIRELESS MESH
NETWORKS

Sikander Singh
Research Scholar, Department of Computer Science & Engineering, Punjab Engineering College (PEC),
Deemed University, Sector-12, Chandigarh-160012 (India)
[email protected],

Dr. Trilok Chand Aseri


Sr. Lecturer, Department of Computer Science & Engineering, Punjab Engineering College (PEC), Deemed
University, Sector-12, Chandigarh-160012 (India)
[email protected]

ABSTRACT
A new routing protocol called Scalable Energy Efficient Ad-hoc on Demand
Distance Vector (SEE-AODV), which has better scalability and energy efficiency
than the existing Ad-hoc on Demand Distance Vector (AODV) routing protocol in
wireless mesh networks, is proposed in this paper. Wireless mesh networks (WMNs)
consist of mesh routers and mesh clients, where mesh routers have minimal mobility
and form the backbone of WMNs. They provide network access for both mesh and
conventional clients. Two techniques, called Clustering and Blocking Expanding
Ring Search, have been applied to the existing AODV routing protocol to improve
its scalability and energy efficiency. Results show that the performance of the
SEE-AODV routing protocol is better than that of the existing AODV routing
protocol in wireless mesh networks. To show the efficiency of the proposed routing
protocol, simulations have been done using Network Simulator-2 (NS-2).

Keywords: Wireless mesh networks, Ad-hoc network, Routing, Distance vector.

1 INTRODUCTION

Wireless mesh network (WMN) [1] technologies have been actively researched and developed as key solutions to improve the performance and services of wireless personal area networks (WPANs), wireless local area networks (WLANs) and wireless metropolitan area networks (WMANs) for a variety of applications, such as voice, data and video. Compared with mobile ad hoc networks (MANETs), wireless sensor networks (WSNs) and infrastructure-based mobile cellular networks, WMNs are (i) quasi-static in network topology and architecture, (ii) not resource constrained at mesh routers and (iii) easy and flexible to deploy. These technological advantages are especially appealing to the emerging market requirements on future wireless networks and services, such as flexible network architecture, easy deployment and self-configuration, low installation and maintenance costs and interoperability with the existing WPAN, WLAN and WMAN networks. Potential applications of WMNs include broadband home networking, community and neighborhood networking, enterprise networking, building automation and so on. These wide-ranging applications have different technical requirements and pose different challenges in the design and deployment of mesh networking architectures, algorithms and protocols.

The objective of this work is to develop routing protocols for wireless mesh networks and to analyze their performance under different environments. The analysis has been done theoretically and through simulations using NS-2 (Network Simulator-2). The objectives of the work are:
1. To simulate the proposed routing protocol, Scalable Energy Efficient Ad-hoc on Demand Distance Vector (SEE-AODV), for wireless mesh networks.
2. To evaluate the routing protocols based on various parameters.
3. To compare the proposed protocol with the existing protocol.

2 RELATED WORK

Wireless mesh networks have recently gained a lot of popularity due to their rapid deployment and instant communication capabilities. These networks comprise somewhat static multi-radio Mesh Routers [2], which essentially provide connectivity between the mobile single-radio Mesh Clients.
Special routing protocols are employed which facilitate routing between the Mesh Routers as well as between the Mesh Routers and the Mobile Clients. AODV is a well known routing protocol that can discover routes on-the-fly in a mobile environment. However, as the protocol was originally developed for single-radio nodes, it frequently lacks the ability to exploit the potential offered by the Mesh Routers. There are hundreds of proposed routing protocols [3]; many of them have been standardized by the IETF and have been in use for many years. Some of those protocols have proven themselves in the Internet and are expected to continue to perform well for many years to come. In the ad-hoc networking arena, several classes of routing protocols have been proposed and carefully analyzed. WMN companies [4] are using a variety of routing protocols to satisfy their needs. Furthermore, the proposed routing protocol takes advantage of the fixed nodes in WMNs. In this paper some enhancements are made to the existing AODV protocol so that it works well in wireless mesh networks, with good scalability and energy efficiency.

3 SCALABLE ENERGY EFFICIENT AD-HOC ON DEMAND DISTANCE VECTOR ROUTING PROTOCOL (SEE-AODV)

To develop the SEE-AODV routing protocol, two techniques called Clustering and Blocking Expanding Ring Search have been applied to improve the performance of the existing AODV routing protocol in wireless mesh networks. The performance of wireless mesh networks is highly dependent on the routing protocol. AODV is a popular routing protocol for wireless networks and is well suited for wireless mesh networks in that it has low processing and memory overhead and low network utilization. Additionally, AODV provides loop freedom for all routes through the use of sequence numbers.

3.1 Design Goals

The design goal of the SEE-AODV routing protocol is to improve the scaling potential of AODV and to make it energy efficient. The main features of AODV-Clustering include:
(i) Gradualness
The protocol first works in the AODV manner and then gradually changes to a clustering routing protocol. There are several considerations behind this: first, there is a central control node in the mesh network; second, this method also allows AODV nodes to coexist with AODV-Clustering nodes; third, it reduces the overhead caused by frequent changes of cluster.
(ii) Coexistence with AODV
AODV is a widely accepted routing protocol, and several implementations of AODV have been reported. One of the important principles in designing AODV-Clustering is coexistence with AODV: nodes which implement the AODV protocol can work collaboratively with nodes that implement the proposed protocol in the same network. To achieve this, all AODV route control packets are kept and some new control packets are added as needed. In fact, AODV-Clustering lets all nodes that have not joined a cluster work in the AODV manner.
(iii) Route Discovery Mechanism
In AODV-Clustering, there are two route discovery mechanisms for the nodes which join a cluster. One is Blocking-ERS, which can increase the efficiency of route discovery; the other is the traditional RREQ flooding route discovery mechanism inherited from AODV. Normally Blocking-ERS is used first; if a suitable route cannot be found before a timeout, the traditional route discovery mechanism is used.
(iv) Local Route Repair
To reduce the number of lost data packets when a route breaks due to mobility, AODV-Clustering lets the node upstream of the broken link perform a local repair instead of issuing a RERR. If the local repair is successful, a RREP will be returned either by the destination or by a node with a valid route to the destination. In the event that a RREP is not returned, the local repair has failed and a RERR message is sent to the source node.

3.2 AODV-Clustering Routing Scheme

(i) Route Discovery
At the beginning the status of all nodes is "unassigned". The source node broadcasts [5] a RREQ message to find a route; the destination node, or an intermediate node that has a fresh route to the destination, replies with a RREP message to the source. On the way the RREP passes by, one or several nodes are selected as cluster heads (CH) using a defined rule. A CH node broadcasts a CH message to its neighbours, and the neighbour nodes which receive this broadcast act differently according to their roles.
(a) A node whose status is "unassigned" will issue a Join Cluster Request to the broadcasting CH and can become an ordinary cluster member after receiving the acknowledgment from this CH. In AODV-Clustering, a CH node can reach all its members in one hop, so the protocol's architecture is a one-level hierarchy.
(b) A node which is an ordinary cluster member will judge whether the broadcast message sender is its original CH; if yes, no action is needed; else, it sends a Join Cluster Request to the CH and becomes a gateway node after receiving the acknowledgement from it. Being a gateway, it will send a Gateway Message to all CH nodes to which it connects directly, letting them put its address in their Gateway Table.



(c) A node which is a gateway will check whether its Gateway Node Table has an entry for this broadcasting CH node; if yes, no action is needed; else, it sends a Join Cluster Request to it and puts its address into the Gateway Node Table after receiving the acknowledgement from it. Gateway nodes have two tables: one is the Gateway Node Table, which contains the addresses of the cluster heads to which the node connects directly; the other is the Joint Gateway Table, which contains the addresses of the CH nodes which can be reached in 2 hops and also of the nodes which help to reach these CHs, which are called Joint Gateways.
(ii) Route Maintenance
AODV-Clustering extends many features of AODV for route maintenance, for example, the use of "hello" messages to confirm the existence of neighbours and of RERR messages to inform the nodes affected by a link breakage. In addition, AODV-Clustering adds some cluster-related maintenance operations such as joining a cluster, leaving a cluster and changing status.
(a) Joining Cluster
Nodes whose status is "unassigned" will join a cluster after receiving a CH Broadcasting Message. To reduce overhead, no periodic CH Broadcasting is used. Instead, an "on demand" method is used for new nodes to join a cluster: when a node whose status is "unassigned" broadcasts a RREQ to its neighbours, a specific mark is set in the RREQ; the mark informs neighbouring CHs to let it join the cluster. In this way, a newly arrived unassigned node has a chance to join the nearby cluster when it has data to send.
(b) Leaving Cluster
If an ordinary cluster member finds its CH node unreachable, it changes its status to unassigned. This means the node is leaving the cluster. A CH node should change its status to unassigned when it has no cluster members.
(c) Changing Cluster
In AODV-Clustering, the change of cluster does not occur immediately. When a node leaves the cluster, its status becomes "unassigned"; it works in the AODV manner until a new CH node appears nearby and it has a chance to join that cluster.

3.3 Route Discovery Approaches in Wireless Mesh Networks

In wireless mesh networks, nodes cooperating for the delivery of a successful packet form a communication channel consisting of a source, a destination and possibly a number of intermediate nodes, with a fixed base station. In this paper some inefficient elements have been found in the well known reactive protocol AODV, and a new approach for rebroadcasting in Expanding Ring Search is proposed. This leads to the Blocking-ERS scheme, as we call it, which demonstrates an improvement in energy efficiency at the expense of route discovery time in comparison to the conventional route search method.

3.3.1 Blocking-ERS
An alternative ERS scheme has been proposed to support reactive protocols such as DSR (Dynamic Source Routing) and AODV, and it is called Blocking Expanding Ring Search. The Blocking-ERS integrates, instead of TTL sequences, a newly adopted control packet, the stop instruction, and a hop number (H) to reduce the energy consumption during the route discovery stage. The basic route discovery structure of Blocking-ERS is similar to that of the conventional TTL sequence-based ERS. One of the differences from TTL sequence-based ERS is that Blocking-ERS does not resume its route search procedure from the source node every time a rebroadcast is required. The rebroadcast can be initialized by any appropriate intermediate node. An intermediate node that performs a rebroadcast on behalf of the source node acts as a relay or an agent node. Fig. 1 shows an example of the Blocking-ERS approach in which the rebroadcasts are initialized by and begin from a relay node M in rebroadcast round 2, another relay node N in round 3, and so on. In Fig. 1 the source node broadcasts a RREQ including a hop number (H) with an initial value of 1. Suppose that a neighbour M receives the RREQ with H=1 and the first ring is made. If no route node is found, that is, no node has the requested route information to the destination node, the nodes in the first ring rebroadcast the RREQ with an increased hop number; for example, a RREQ with H=2 is rebroadcast in this case. The ring is expanded once again just like the normal expanding ring search in AODV, except with an extended waiting time.
The waiting time can be defined as

Waiting Time = 2 × Hop Number

The nodes in Blocking-ERS receiving RREQs need to wait for a period of 2H, i.e. 2 × their hop number unit times, before they decide to rebroadcast, where the "unit time" is the amount of time taken for a packet to be delivered from one node to a one-hop neighbouring node.

Figure 1: Blocking-ERS
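The rebroadcast rule just described (wait 2H unit times at hop H, and rebroadcast only if no stop instruction has arrived and no local route is known) can be sketched in a few lines. This is an illustrative model of the rule under those stated assumptions, not the NS-2 implementation used later in the paper.

    def waiting_time(hop_number: int) -> int:
        """Waiting Time = 2 x Hop Number, in one-hop 'unit times'."""
        return 2 * hop_number

    def should_rebroadcast(stop_received: bool, has_route: bool) -> bool:
        """After waiting, a node rebroadcasts the RREQ only if no stop
        instruction has arrived and it has no route to the destination itself."""
        return not stop_received and not has_route

    for h in (1, 2, 3):
        print("ring", h, ": wait", waiting_time(h), "unit times before deciding to rebroadcast")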



(A) Energy Consumption
Energy consumption during the transmission of RREQs can be saved by using the Blocking-ERS scheme. Let the amount of energy consumed by each node for one broadcast be the same unit energy consumption, denoted by UnitEnergy. Assume that each action of broadcasting a RREP, a RREQ or a "stop instruction" consumes the same amount of 1 UnitEnergy. The saving can then be shown by the difference in energy consumption between the conventional TTL sequence-based ERS and the Blocking-ERS scheme.

(i) One route case: First consider only the energy consumption along the route from the source to the route node. The energy consumption for the TTL-based ERS and for the Blocking-ERS can be described by Eq. (1) and Eq. (2) respectively, where Hr is the hop number of the route node.

$$E_{TTL\text{-}ERS} = H_r + \sum_{i=1}^{H_r} i \quad (\text{UnitEnergy}) \qquad (1)$$

$$E_{Blocking\text{-}ERS} = 3 H_r \quad (\text{UnitEnergy}) \qquad (2)$$

The difference in the amount of energy consumption is more visible from Fig. 2, where the amounts of energy consumed by the two ERS approaches are plotted against the number of rings. The Blocking-ERS curve is below the TTL-based ERS curve after ring 3. As is clear from Fig. 2, the difference in energy consumption between the two mechanisms becomes larger as the distance between the source and the route node increases.

Figure 2: Comparison of energy consumption for one route (Energy is measured in Joule, and Distance in number of hops)

(ii) General case: Now consider the general case. For the Blocking-ERS, the energy consumption during the route discovery process can be considered as the total energy consumption in three stages: (a) searching for the route node, (b) returning the RREP and (c) sending the "stop instruction". For the conventional TTL-based ERS, the energy consumption during the route discovery process includes that of two stages: (a) searching for the route node and (b) returning the RREP. The energy consumed for "(b) returning a RREP" is Hr UnitEnergy for both routing schemes, and Hr UnitEnergy is consumed for the Blocking-ERS stage "(c) sending the stop instruction". In the stage "(a) searching for the route node", the energy consumption is different for the two methods. Each ring contains a different number of nodes that rebroadcast to form the next ring. Let ni be the number of nodes in ring i and let the hop number of the route node be Hr.

In the Blocking-ERS, the energy consumed in each ring is as below:

Ring i      Energy Consumed
0           1
1           n1
...         ...
Hr - 1      n(Hr-1)

In the TTL-based ERS, the energy consumed in each ring is as follows:

Ring i      Energy Consumed
0           1
1           1 + n1
...         ...
Hr - 1      1 + n1 + n2 + ... + n(Hr-1)

Therefore, the total energy consumption by the Blocking-ERS is given by Eq. (3)

$$E_{Blocking\text{-}ERS} = 2\Big(1 + \sum_{i=1}^{H_r} n_i\Big) + E_{RREP} \quad (\text{UnitEnergy}) \qquad (3)$$

Similarly, the total energy consumption by the conventional TTL sequence-based ERS is given by Eq. (4)

$$E_{TTL\text{-}ERS} = H_r + \sum_{i=1}^{H_r} \sum_{j=1}^{i} n_j + E_{RREP} \quad (\text{UnitEnergy}) \qquad (4)$$

The difference between E_Blocking-ERS and E_TTL-ERS is given by Eq. (5)

$$E_{Saved} = H_r - 2 + \sum_{i=1}^{H_r - 1} \Big(\Big(\sum_{j=1}^{i} n_j\Big) - 2 n_i\Big) \quad (\text{UnitEnergy}) \qquad (5)$$

Clearly, when ni = 1 for i = 1, ..., Hr, the above formulas represent the energy consumption for a single route. This indicates that the energy consumption saving achieved by the Blocking-ERS for a single route is the minimum amount of energy saving.
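Eqs. (3) and (4) can be checked numerically with a short script. The ring populations below are arbitrary example values, and E_RREP is taken as Hr UnitEnergy as stated in the text; this is only a sanity check of the formulas, not a simulation.

    def e_blocking_ers(ring_sizes, hr, e_rrep):
        """Eq. (3): E = 2*(1 + sum_{i=1..Hr} n_i) + E_RREP, in UnitEnergy."""
        return 2 * (1 + sum(ring_sizes[1:hr + 1])) + e_rrep

    def e_ttl_ers(ring_sizes, hr, e_rrep):
        """Eq. (4): E = Hr + sum_{i=1..Hr} sum_{j=1..i} n_j + E_RREP, in UnitEnergy."""
        nested = sum(sum(ring_sizes[1:i + 1]) for i in range(1, hr + 1))
        return hr + nested + e_rrep

    hr = 4                          # hop number of the route node
    ring_sizes = [1, 3, 5, 8, 12]   # n_0..n_4: example ring populations (ring 0 is the source)
    e_rrep = hr                     # returning the RREP costs Hr UnitEnergy (see text)

    e_b = e_blocking_ers(ring_sizes, hr, e_rrep)
    e_t = e_ttl_ers(ring_sizes, hr, e_rrep)
    print("Blocking-ERS:", e_b, " TTL-ERS:", e_t, " saved:", e_t - e_b)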



(B) Time Delay
Consider the time delay for the route discovery period, during which the RREQ, broadcast for the first time, is transmitted from the source node to the route node, possibly via flooding. That is, the total time taken from when the source node broadcasts the first RREQ until a Route Node is found and the source node receives the RREP from the Route Node. Let the UnitTime be the one-hop transmission time, i.e. the time taken for a RREQ to travel from a broadcasting node to one of its neighbour nodes. In the case of the TTL sequence-based ERS, suppose H = 3, that is, the route node is 3 hops distant from the source node. The total time includes the time when TTL = 1, 2 and 3. The final TTL number equals the hop number of the route node. This gives the following formula for the total time delay of the TTL sequence-based ERS (Eq. 6):

$$T_{TTL\text{-}ERS} = 2 \sum_{i=1}^{H_r} i \quad (\text{UnitTime}) \qquad (6)$$

Now consider the time delay in the Blocking-ERS. The total time includes the time for three stages: (a) searching for the route node, (b) returning the RREP and (c) broadcasting the "stop instruction". For stage (a), the time consists of the broadcasting time and the waiting time. The broadcasting time for a 1-hop distance is 1 UnitTime. The waiting time depends on the hop number of the node. Suppose again that the route node is 3 hops distant from the source node. Each node needs to wait for 2H before rebroadcasting. At ring 1 the node waits for 2 × 1 = 2 UnitTime, at ring 2 the node waits for 2 × 2 = 4 UnitTime, and at ring 3 the node waits for 2 × 3 = 6 UnitTime, so the total waiting time for stage "(a) searching for the route node" is 2 + 4 + 6 = 12, and the total time for stage (a) is 12 + Hr = 12 + 3 = 15. The times for stages (b) and (c) are Hr each, which gives 2Hr = 2 × 3 = 6. Therefore, the total time for the route discovery and flooding control is 15 + 6 = 21 UnitTime. The mathematical formula is presented below, where Hr represents the hop number of a route node.

Figure 3: Comparison of the time delay (Time is measured in millisec, and Distance in number of hops)

The formula for the time delay in the Blocking-ERS is given by Eq. (7):

$$T_{Blocking\text{-}ERS} = 3 H_r + 2 \sum_{i=1}^{H_r} i \quad (\text{UnitTime}) \qquad (7)$$

Compare this to the TTL sequence-based ERS as given in Eq. (8):

$$T_{TTL\text{-}ERS} = 2 \sum_{i=1}^{H_r} i \quad (\text{UnitTime}) \qquad (8)$$

It is clear that the difference between the two is 3Hr, three times the hop serial number of the route node, depending only on the distance between the source node and the Route node. The time delay of both approaches is compared in Fig. 3. As illustrated, the Blocking-ERS takes slightly more time than the conventional TTL sequence-based ERS for the route discovery process.

4 PERFORMANCE EVALUATION

To evaluate the performance of the SEE-AODV routing protocol (AODV-Clustering with the Blocking-ERS technique), the simulation is done using NS-2 [6]. The performance of the SEE-AODV routing protocol is evaluated by comparing it with the existing AODV routing protocol under the same conditions. Three performance metrics are evaluated:
(i) Packet Delivery Fraction: This is the fraction of the data packets generated by the sources that are delivered successfully to the destination. This evaluates the ability of the protocol to discover routes.
(ii) Routing Load: This is the ratio of control packet overhead to data packet overhead, measured by the number of route control packets sent per data packet. The transmission at each hop along the route was counted as one transmission in the calculation.
(iii) Average Route Acquisition Latency: This is the average delay between the sending of a route request packet by a source for discovering a route to a destination and the receipt of the first corresponding route reply. If there is a fresh route already, 0 was used for calculating the latency.

Figure 4: AODV, SEE-AODV packet delivery fraction
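Before turning to the simulation results, Eqs. (6) and (7) can be checked against the Hr = 3 example worked out above (12 UnitTime for the TTL-based ERS against 21 UnitTime for the Blocking-ERS, a difference of 3Hr). The script below is only a numeric check of those formulas.

    def t_ttl_ers(hr: int) -> int:
        """Eq. (6)/(8): T = 2 * sum_{i=1..Hr} i, in UnitTime."""
        return 2 * sum(range(1, hr + 1))

    def t_blocking_ers(hr: int) -> int:
        """Eq. (7): T = 3*Hr + 2 * sum_{i=1..Hr} i, in UnitTime."""
        return 3 * hr + 2 * sum(range(1, hr + 1))

    hr = 3
    print(t_ttl_ers(hr), t_blocking_ers(hr), t_blocking_ers(hr) - t_ttl_ers(hr))
    # 12 21 9  -> the difference is 3*Hr, as stated in the text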

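The three metrics above are simple ratios over trace counters. A minimal sketch is shown below; the counter values are made up for illustration and do not come from the paper's NS-2 traces.

    def packet_delivery_fraction(data_received: int, data_generated: int) -> float:
        """(i) Fraction of source-generated data packets delivered to their destinations."""
        return data_received / data_generated

    def routing_load(control_transmissions: int, data_packets_delivered: int) -> float:
        """(ii) Route control packets sent per data packet
        (each hop-by-hop transmission counted once)."""
        return control_transmissions / data_packets_delivered

    def avg_route_acquisition_latency(latencies_ms) -> float:
        """(iii) Mean delay between a route request and the first matching route reply;
        0 is used when a fresh route already exists."""
        return sum(latencies_ms) / len(latencies_ms)

    print(packet_delivery_fraction(9200, 10000))          # e.g. 0.92
    print(routing_load(18400, 9200))                      # e.g. 2.0
    print(avg_route_acquisition_latency([0, 35, 42, 0, 51]))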


As shown in Fig. 4, the packet delivery fraction obtained using SEE-AODV is almost identical to that obtained using AODV when the number of nodes is small. However, when there are larger numbers of nodes (i.e., more than 200), SEE-AODV performed better. This suggests that SEE-AODV is highly effective in discovering and maintaining routes for the delivery of data packets.

Figure 5: AODV, SEE-AODV Routing Load

Fig. 5 shows the routing load comparison of the two protocols. Routing load is measured by the number of route control packets sent per data packet. For AODV, route control packets include RREQ, RREP, RERR and Hello messages; for SEE-AODV, cluster-related messages such as the CH Broadcast Message, Join Cluster Request, Join Cluster ACK and Gateway Message are also included. It is clear from Fig. 5 that the routing load of SEE-AODV was significantly lower than that of AODV when there are large numbers of nodes in the network. SEE-AODV gradually becomes a hierarchical protocol, and hierarchical routing greatly increases the scalability of routing in wireless networks by increasing the robustness of routes.

Figure 6: AODV, SEE-AODV Average Route Acquisition Latency

Fig. 6 shows the Average Route Acquisition Latency of the two protocols. SEE-AODV performed better than AODV in more "stressful" situations (i.e. a larger number of nodes and more load); this is largely attributable to the Blocking-ERS technique of AODV-Clustering and the reduction of RREQ flooding.

5 CONCLUSION AND FUTURE WORK

In this paper the Scalable Energy Efficient AODV (SEE-AODV) routing protocol has been introduced to solve the scalability problem of AODV by applying Clustering and to make it energy efficient by using the Blocking-ERS technique in wireless mesh networks. The performance is studied by simulations based on NS-2. The results show that the SEE-AODV protocol achieves better scalability than the existing AODV while keeping its merits. The analysis demonstrates a substantial improvement in energy consumption that can be achieved by the Blocking-ERS at the marginal cost of a slightly longer time. There are still some limitations in the SEE-AODV routing protocol; for example, during discovery of a route the technique used in this protocol takes slightly more time than the existing conventional one, so further study is needed to improve this drawback.

6 REFERENCES

[1] K. N. Ramachandran: On the Design and Implementation of Infrastructure Mesh Networks, IEEE Wksp. Wireless Mesh Networks, Calcutta, pp. 78-95 (Sept 2005).
[2] R. Draves, J. Padhye, and B. Zill: Routing in Multi-Radio, Multi-Hop Wireless Mesh Networks, Mobile Communication, pp. 114-128 (Sept 2004).
[3] R. Draves, J. Padhye, and B. Zill: Comparison of routing metrics for static multi-hop wireless networks, Proc. of SIGCOMM'04, (Portland, OR), pp. 12-24 (Aug 2004).
[4] I. F. Akyildiz and X. Wang: A Survey on Wireless Mesh Networks, IEEE Commun. Mag., vol. 43, no. 9, pp. S23-S30 (Sept 2005).
[5] V. Li, H. S. Park, and H. Oh: A Cluster-Label-Based Mechanism for Backbone on Mobile Ad Hoc Networks, The 4th Wired/Wireless Internet Communications (WWIC 2006), pp. 26-36 (May 2006).
[6] http://www.isi.edu/nsnam/ns/nsdocumentation.
