Information Delivrey p2p Network
Information Delivrey p2p Network
Information Delivrey p2p Network
82
Jani Kurhinen
JYVÄSKYLÄN YLIOPISTO
JYVÄSKYLÄ STUDIES IN COMPUTING 82
Jani Kurhinen
UNIVERSITY OF JYVÄSKYLÄ
JYVÄSKYLÄ 2007
Information Delivery in Mobile
Peer-to-Peer Networks
JYVÄSKYLÄ STUDIES IN COMPUTING 82
Jani Kurhinen
UNIVERSITY OF JYVÄSKYLÄ
JYVÄSKYLÄ 2007
Editors
Tommi Kärkkäinen
Department of Mathematical Information Technology, University of Jyväskylä
Irene Ylönen, Marja-Leena Tynkkynen
Publishing Unit, University Library of Jyväskylä
URN:ISBN:9789513930806
ISBN 978-951-39-3080-6 (PDF)
Kurhinen, Jani
Information Delivery in Mobile Peer-to-Peer Networks
Jyväskylä: University of Jyväskylä, 2007, 46 p.(+included articles)
(Jyväskylä Studies in Computing
ISSN 1456-5390; 82)
ISBN 978-951-39-3021-9
ISBN 978-951-39-3080-6 (PDF), 978-951-39-3021-9 (nid.)
Finnish summary
Diss.
The work aiming at doctoral degree is done for myself. Of course I couldn’t
have done it only by myself, but several persons have given their contribution
in their own way. My sister Maria showed me the path, and it has been easy
to follow her steps to the academic world. Professor Hämäläinen took me to
his team and gave me all the resources I ever needed. Jarkko Vuori was there to
guide me when I was taking my first steps as a post graduate student. In addition
to Jarkko, Vesa, Mikko, Matthieu, Jukka, and Niko helped me publish my ideas.
Professor Ristaniemi provided me with valuable support in composing this book,
even though I was stubborn and wanted to do some things my way. But most of
all, I express my graditude to my wife Sanna and to my parents who really made
all this possible.
In addition to my salary from the university of Jyväskylä, I have received
financial support from Ellen and Artturi Nyyssönen foundation, Ulla Tuominen foun-
dation, Jenny and Antti Wihuri foundation, and Nokia foundation. Thank you for
believing in me.
ABSTRACT
ACKNOWLEDGEMENTS
ACRONYMS
LIST OF FIGURES
CONTENTS
LIST OF INCLUDED ARTICLES
1 INTRODUCTION ............................................................................ 13
PIV J. Kurhinen. MP2P Network in Collecting Data from Delay Tolerant Sen-
sor Networks. In Proceedings of the 11th IEEE Symposium on Computers and
Communications (ISCC 2006), pages 246-250, 2006.
The author has participated in the writing process of all the listed publications.
Below is a short description of the articles and the author’s contribution on the
content of the papers.
Publication PI introduces a mobile learning environment, in which content
can be shared between mobile terminals using short-range connections. The au-
thor introduced the idea of using Mobile Chedar middleware in this context. In
publication PII the author presented the original idea of focusing on the informa-
tion diffusion process instead of the mobility of network nodes. The paper intro-
duces the exchange pipe model to describe the information diffusion process. In
publication PIII the author presented additional results which were found when
studying information diffusion using the exchange pipe model, and especially
the effect of utilizing MP2M technology. In publication PIV the idea of mod-
elling information diffusion with the exchange pipe model was reversed in order
to describe information collection from several sources to one sink node. The
simulation environment used in publications PII-PIV was designed and imple-
mented by the author. Publication PV introduces a concept of mobile encounter
networks. The author defined the mathematical model for MP2P encounters,
and thus provided the encounter networks with a formal definition. Publication
PVII continues discussing mobile encounter networks. The author provided the
idea of using different mobility models to describe behaviour of mobile nodes
in different parts of the simulation environment. The author also participated
in planning the simulation settings and evaluating the results. Publication PVI
and PVIII discuss MP2P nodes as delay tolerant networks. In PVI the author
introduced the idea of smart messages which route themselves by monitoring
movement of mobile data carriers. In PVIII the author defined the framework
for the paper and planned requirements for the simulations.
ACRONYMS
C/S Client/Server
C2C Car-to-Car
C2I Car-to-Infrastructure
DTN Delay-Tolerant Network
MANET Mobile Ad Hoc Network
MEN Mobile Encounter Network
MP2P Mobile Peer-to-Peer
P2P Peer-to-Peer
PDA Personal Digital Assistant
PIM Personal Information Manager
VANET Vehicular Ad Hoc Network
V2R Vehicle-to-Roadside
V2V Vehicle-to-Vehicle
1 INTRODUCTION
The way machines communicate with each other can be modelled with three
communication paradigms: Client/Server, Push and Peer-to-Peer. Before contin-
uing with mobile peer-to-peer networking, all the paradigms are briefly intro-
duced.
Client/server (C/S)is the traditional way of data communication. In this
model a server possesses some data elements that a client is interested in. In
order to acquire the desired elements, the client sends a request to the server. The
server reacts to the request with a response in which it delivers the requested data,
or some other information if the requested data is undeliverable. Therefore, we
can say that this paradigm is a reactive communication method.
Push model is the second of the older and traditionally used methods. Simi-
larly to the C/S model, it also has a client and a server device, but in this method,
as the name implies, the data is sent without any particular request, i.e. it is
pushed and this is proactive communication method. When comparing the actual
data delivery between C/S and push models, one of the biggest issues is related
to the targeting of data to the client. In the C/S model the client sends a request
and includes information where to deliver the response. The push method, on
the other hand, must rely on some other mechanisms when setting the target
information: it can use, for example, broadcasting or previously stored address
information. The data reception is, however, prone to failures, since there are less
guarantees compared to the C/S model that there actually is a client device ready
to receive a transmission.
With both C/S and push models there is a specialized data container unit
that provides data to those wishing to consume it. The peer-to-peer (P2P) model,
on the other hand, consists of a set of machines who can all act either as a data
provider or data consumer. At a micro level, where one observes only one com-
munication transaction, a typical P2P communication resembles the C/S model: a
data consumer requests content from a data provider who returns the applicable
data, and thus this is also a reactive communication method. On the other hand,
if someone within the P2P communication system wishes to spread information
it possesses, it can follow the push model. The ideological difference is that there
16
Pure Peer-to-Peer
Client/Server Request
Response
Request
Request Response
Request
Response
Response
Push
Hybrid Peer-to-Peer
Push
II
Response
are no particular data providers or data consumers, but all the networking nodes
might be able to act as such. Therefore the majority of the networking nodes are
no longer passive information consumers, but they have become active parts that
can respond to other nodes’ needs [45].
In fact, the dualistic role of the nodes is the main reason for public interest
in the P2P systems. The Peer-to-peer network can harness resources scattered
over the network environment much better than any of the traditional models.
Parmeswaren et al. [45] makes a note that with P2P system the network is no
longer only a mandatory functional component, but it is a source of resources,
and P2P systems concentrate on utilizing those resources. Network users have
applied the peer-to-peer networking paradigm eagerly, but they have also found
the weakness of the system’s distributed nature. The plenitude of data stored all
around the network makes it difficult to find some particular piece of data. In a
C/S system, there is a known source of data to which a request must be sent, but
the P2P paradigm does not have such a single source. In order to create a solution
to the data location problem, there have been proposals for the technology that is
something between C/S and P2P.
Since the P2P technology is relatively young, the terminology is not yet fully
stabilized. Nakamura et al. [39] and Oh-ishi et al. [42] uses the terms pure P2P
and hybrid P2P to describe the different approaches. On the other hand, Walker-
dine et al. [57] call them fully distributed and semi-centralized respectively. The
pure or distributed peer-to-peer system only contains equal network nodes, and
follows the P2P paradigm presented above. In the hybrid or semi-centralized
P2P network there are also some special nodes which provide indexing services.
17
When the actual peers inform the special nodes that they have some resource to
be shared, it is easy for a data consumer to search the content from the indexing
server and then request the data from its provider. Figure 1 presents a summary
of the communication paradigms.
In the 1970’s the US Defense Advanced Research Project Agency (DARPA), US army
and US Office of Naval Research (ONR) created a technology in which mobile troops
were not dependent on one single command unit, but units were able to commu-
nicate with each other without any predefined chain of communication. The tech-
nology was called Mobile Packet Radio Networking and it was defined as a collec-
tion of mobile platforms forming a distributed network which can operate with-
out any help of other infrastructure except the units themselves [32]. Today the
technology is called Mobile Ad Hoc Networking (MANET).
The principle idea behind MANET lies in multipurpose network nodes which
can communicate with each other. As stated above, this communication must
take place without any external help: This includes a situation where two com-
municating nodes do not have a direct radio link with each other. In order to
be able to communicate without the direct link, the data transmission must be
delivered via other intermediate nodes (Fig. 2).
A G
B
E
C D
As can be seen from Figure 2, the message that node A sends to node F, must be
delivered via several other network nodes which do not participate in the actual
communication. In the figure the message was routed via nodes B,C,D and E,
while the shortest, and possibly the optimal, path would have been via nodes B
and G. The reason for making the decision to use the longer route may have been
18
2.2.1 Communication
This thesis discusses information delivery within a MP2P system, and it focuses
on two approaches: 1) information diffusion from one source to a set of MP2P
nodes, 2) information collection from several sources to one target using MP2P
nodes as carriers. As was previously pointed out routing through intermediate
nodes within such a dynamic system is challenging. When observing data trans-
mission more closely, we can find two opposite cases: reactive, on demand data
transmission and proactive data transmission which actively participates in the
20
Transmission type
Communication method Single hop Multi hop
Reactive Query from a Routing in a
known source dynamic environment
Proactive Data exchange between Routing in a
encountering nodes stable environment
hop transmission when a node looks for some certain piece of data, but does not
know the source for it or is not able to move to the source location. In this case
the only possible solution is to request the data from other nodes.
2.2.2 Routing
The previous section introduced the single hop communication, which can sim-
plify the situation in some cases. However, it does not suite all the communica-
tion systems. The single hop method can only be used with applications that can
tolerate certain delays. In general these types of systems are called delay tolerant
networks, DTN [17]. On the other hand, DTNs can not only rely on direct node
to node data delivery, but in many cases they can benefit from multi hop routing.
Data MULEs [51] is one of the very first concepts describing this kind of environ-
ment, in which data was routed using several independent mobile carriers. Our
studies in PIV, PVI and PVIII discuss similar network systems, where data is
collected from several sources to one data sink. The data was collected and trans-
ported by mobile entities that were already moving within the environment, and
therefore the delivery would not cause additional costs. The multi hop transmis-
sion was, indeed, the most practical approach in this case: Instead of giving full
responsibility to one of these mobile entities to deliver the data packet to its target
location, it was possible to pass the data to another unit which, in turn, might be
able to transmit the data to the actual receiver.
The multi hop routing algorithms can be divided into three classes: 1) Epi-
demic spreading, 2) Epidemic spreading with limitations or restrictions, and 3)
targeted data delivery. Epidemic routing was introduced by Vahdat and Becker
[56]. As the name implies the algorithm works like a disease: using epidemic
routing, messages are passed to all possible network nodes in the hope that some-
one is able to deliver it to a target location. In fact, it is a really powerful method
and always gives the smallest delay if the network system handles the data flow
properly. The same thing that makes epidemic routing so efficient also makes it
extremely heavy on the system. While copying the messages to other network
nodes, the epidemic algorithm wastes plenty of system’s resources like storage
capacity, radio time and power. As was pointed out in PVIII, for example, rela-
tively soon after the release of the data, the whole network population is infected
by it.
The second class of multi hop routing aims at reducing the disadvantages of
epidemic spreading by setting some kind of limitations to the number of copies
or trying to stop the spread as soon as the data has found its way to the receiver.
Harras et al. [19] has studied different approaches to limit epidemic diffusion by
controlling the number of retransmissions and preventing epidemic spread with
acknowledgement messages they called a cure. The message receiver sends a
cure when the first copy arrives. Cures heal the epidemic by erasing the original
messages as they spread into the network. A similar idea was presented in [22]. A
request message was sent using epidemic spreading, but the response traffic was
limited. Moreover, Spyropoulos et al. [52] has proposed Spray and Wait and later
22
Spray and Focus [54] protocols, which are other methods to limit the epidemic
diffusion. Spray and Wait exploits the different types of counters to control the
number of copies in the network. Spray and Focus is the second evolution ver-
sion, which combines copying and forwarding. These schemes, however, do not
make the effort of selecting between distinct nodes while passing on message
copies. Instead, they employ numerous randomly selected nodes as message car-
riers, although the Spray and Focus protocol tries during the focusing phase to
take advantage of potential opportunities to forward the message closer to its
destination.
The third class of multi hop routing protocols can be described as being
more intelligent than the two previous classes. Opposite to random spreading
these methods focus on selecting an appropriate carrier node among the con-
tacted nodes, and to keep the messages themselves if there are no better carriers.
An obvious method to route a message is to send it via nodes which form the
shortest path. This would, however, require information about the topology of
the network. As mentioned before, information about the topology, and there-
fore the optimal path, can be obtained reactively or proactively depending on the
application scenario. In either way, both the sender and the intermediate nodes
must know the route to the target, and this assumption is not always applicable.
In some cases, like in city centres or on main roads, the possible message carriers,
i.e. the networking environment, is extremely dynamic, and there might not exist
any path between a sender and receiver that would be stable enough to enable
data transmission if a valid path would be required.
One promising set of intelligent methods are location-based routing proto-
cols. These protocols use geographic or direction information of both the mobile
carriers and target location. Chen et al. [13] has proposed one idea in which
they couple a store-carry-forward paradigm to localized geographical routing
schemes. In their optimistic forwarding strategy, as they call it (also called oppor-
tunistic forwarding [29][59]), a message is forwarded closer to its geographical
destination by selecting a node which has the smallest distance between a node’s
and the recipient’s geographic location. The protocol selects the message carrier
from all those nodes, which can be reached with one epidemic multi-hop query.
Thus, it finds the closest node from a connected cloud regardless of the cloud’s
topology.
Another method, MDDV, was introduced by Wu et al. [59]. The idea was to
disseminate messages between sender and receiver along a defined trajectory of
a mobile unit. When compared to [13], this method takes advantage of different
types of location information. In MDDV, nodes advertise the last known loca-
tion of the most forwarded copy and the message carriers nearby that location
form a group, which actively disseminates messages to encountered nodes. In
addition, the protocols in [29] (MoVe) and PVIII (PGR) use not only the location,
but also the motion of the carrier nodes. MoVe algorithm, relays messages to a
node whose motion vector points closer to the message’s destination. PGR, on
the other hand, combines direction and the known trajectories (i.e. predictable
mobility) of the nodes, and therefore it can avoid some of the problems of both
23
Sensory nodes
Central node B
This is the case in Figure 3. It depicts this traditional way of collecting sensor sys-
tem data. The smaller units representing the sensor nodes gathering information
from their environment and sending it via routes represented by the arrows to the
central node. When using the topology depicted in the figure, the intermediate
sensor nodes A and B not only gather information themselves but also transmit
all the information they get from other sensor nodes to the central node. This
leads to the question that is maybe the most important research problem in the
context of sensor networks: How to save energy to retain the sensors functioning
for a long period of time without external power and maintenance?
Depending on the application scenario, both short- and long-range wireless
communication links are used in inter-sensor communication [28], but only sel-
dom is mere short-range communication sufficient. Long-range communication
is, however, expensive in terms of energy consumption. It would be more effi-
cient, if low-power short-range connectivity could be used, especially while the
nodes should be able to function independently with many practical applications
for long periods of time. This was a part of the research problem on PIV. In the
paper it was noted that taking advantage of short-range connectivity and mo-
bile data carriers, we can omit the requirement of having an energy consuming
long range radio link. It was pointed out that transmitting a small number of
copies with a short-range radio link to nodes visiting the communication range
was enough to provide high probability of successful data delivery.
In the previous sections we discussed how the network nodes communicate and
how it is possible to deliver messages from a sender to a receiver. We have used
25
terms like entity or unit as representatives of those nodes. However, we have not
yet defined who or what the actual mobile nodes could be. The purpose of this
section is to lighten up the background for the question that why do we have this
particular field of research.
A paper written by Arai et al. [2] is one of the very first publications which
talks about information diffusion within a system that closely resembles MP2P
community. Their paper is a study of a set of mobile robots which have some
tasks to perform. The robots receive their instructions from fixed locations while
moving in the area. After having received the instructions, a robot is able to for-
ward it to another robot using local communication.The system is further analyzed
in [64].
The robots, even though they might be used for several different types of
tasks, they are not the real reason for MP2P research. Ludford et al. has written
a paper [31], in which they have spelt out a phrase that well characterizes the
background: Because I carry my cell phone anyway. The rise of mobile computing
started in the early 90s, but at that time the main purpose of those devices was
in personal information management (PIM) or in pure (voice) communication. It
was the Palm phenomenon, when the Palm Pilot’s markets grew faster than any
other computing product in history[38], and the true mobile phones, which were
small enough to be carried, that gave an initial impulse to the development that
has created a need to study mobile computing as part of our everyday life. Fur-
thermore, as the devices, and the technology behind them, matured, the portable
communication machines became something more than just PIM or communica-
tion terminals. PI discusses an application scenario where personal mobile hand-
held devices could be used as a tool for co-operative classroom working.
Human carried devices are, in fact, one of the most important types of an
MP2P system. A pocket Switched Network concept [20] was introduced to char-
acterize a networking environment that would be based on a combination of hu-
man mobility, mobile computing devices and local wireless networking. In ad-
dition to pocket-sized computers that users are able to carry with them all the
time, there is another technology that is also related to humans and is more ubiq-
uitous. Wearable computing, i.e. smart electronics embedded, for example, in
clothes, boots or eye glasses, is perhaps not yet a mainstream technology of to-
day, but it is likely to be important in the near future. A proposed application
with which wearable computing devices were harnessed to deliver data can be
found, for example, in [15].
In addition to human carried computing devices the transformation of mo-
bile phones and PDAs into powerful mobile computers and increasing the pres-
ence of powerful embedded devices have inspired, for example in [67], to take
advantage of this new type of computing resource. In the paper, the researchers
present an application for taxi reservation using local short-range connections,
mobile handsets and ad hoc networking. The application, therefore, combines
automobiles and humans to form one mobile ad hoc network.
An ad hoc network, which is formed of different types of vehicles, is another
important set of MP2P network nodes. This kind of network has its own acronym
26
When studying a system like a mobile peer-to-peer network, often the first thing
to do is to select a node mobility model best suited for the particular case. This
approach is a good one if the aim of the study is to observe behaviour of nodes.
However, this is not always the case, and another common target of research is
inter-node communication. If the only target of the study is to estimate commu-
nication and data transmission processes, a simulation of the node mobility does
not provide any added value if there is a suitable model for the communication
process. This chapter discusses the both, modelling of the mobility and mod-
elling of the communication. The chapter also presents the main contribution of
this doctoral thesis and their relationship to existing studies.
3.1 Mobility
brid model would be a synthetic one, but it would be defined based on real world
observations and not only on statistics. A similar classification is presented in [4],
where a hybrid model was defined to have deterministic behaviour in either a
spatial or temporal domain.
In this thesis the model classification is furthermore refined to consist of
two major branches: random mobility and trace based mobility. Random mobility is
statistical by nature: with these models a node selects a random target location,
and finds its way there. If there are obstacles or rules for the movement, they
must be respected. Therefore, even while the paths that the nodes move along
are selected randomly mobility is not necessary totally random. The trace based
mobility, on the other hand, relies on real paths that real world entities could
actually use. The paths, or the traces, of the nodes are based on the observation of
a real world scenario, but they do not have to be exactly the same. The simulation
settings can contain different synthetic forms of the node trajectories with varying
mobility parameters, like velocity and decision making at crossroads.
Abstract mathematical models are invaluable tools, because they can provide re-
peatable results and the possibility to adjust parameters to control output. They
do not try to depict accurately the real world at an entity level, but they aim at
simulating the system. Therefore, they can be used to achieve statistically sound
results even though a mobility pattern of a single node might seem absurd.
The random walk mobility model is the simplest of the used mobility models.
It mimics the Brownian motion where molecules freely and randomly vibrate
in gas or liquid. An important characteristic of the random walk is that within
a 2-dimensional space the node will return to its starting point with complete
certainty. This feature can be used in scenarios where the mobility should take its
place around the starting point and not to escape from the simulation space [8].
The Random waypoint model provides a small addition to the simple random
walk: pause times between changes in speed or direction. With this feature the
model reaches closer to a real world scenario by letting the nodes stand still. With
the random walk model, the nodes are in constant motion. Due to the fact that
the only meaningful differences between these two models are the pause times,
the above mentioned characteristics and application scenarios are also valid for
the random waypoint model. Figure 4 depicts an example path of a mobile node
created either with the random walk or the random waypoint model. This model
is the most popular when doing simulations with ad hoc networks. For exam-
ple, Yoon et al. [61] noted that nine out of ten node mobility simulations in ACM
MobiHoc2002 conference used this model. At the same time they were also wor-
ried about the behaviour of the model: It will take a lot of time for the model to
achieve a state where average node mobility would be steady. They have shown
that the model suffers from average speed decay, which will affect the results be-
cause the environment setting will vary. The problem is caused by an inevitable
long slow duration velocity paths, during which the slow moving nodes skew
29
the average speed of the system. By selecting a non-zero minimum speed will
help reach a steady state earlier, but the problem will still exist.
The same research group, which already reported the speed decay prob-
lem of the random waypoint model, continued studying the phenomenon, and
noted that any random mobility model that chooses its speed and destination in-
dependently from each other suffers from the same problem [62]. Yet another,
already widely known problem of the above mentioned models, is an uneven
spatial node distribution. This problem is caused by the fact that random se-
lection of target positions using a uniform distribution of coordinates makes the
nodes spend more time near the centre of the simulation space [8]. To achieve a
more even nodal distribution Royer et al. modified the random waypoint model
and created the random direction model [48]. Instead of selecting a target posi-
tion, in this model a node selects a random direction, to which it continues until
it reaches the boundary of the simulation area, and in this way it utilizes more
areas near the edges compared to the other two older models.
As illustrated in Figure 5 the path of the node uses the simulation space more
equally compared to the random walk or random waypoint models. Together
30
with the random waypoint model, the random walk and the random direction
models form the top three of the most popular mobility models used to study ad
hoc networks [65].
In addition to the above mentioned three popular models, there are a num-
ber of other models which have the same basic setting: There is a simulation space
(usually rectangular in shape) in which nodes freely move and change speeds
and directions according to some algorithm. The models include additional char-
acteristics like nodes appearing from the other side of the space when they hit
the edge, avoiding sharp turns by remembering the old direction or selecting di-
rections that would be the most probable [8]. A more noteworthy modification
is the presence of objects that limit free movement. These kinds of models are,
for example, city section [8], Obstacle [23], and freeway and Manhattan [3] mobility
models. With these models the mobility is still random, but it is restricted to fol-
low some pre-defined routes and to avoid some areas. The effect of such rules is
an increased number of average neighbours and the encounter of other nodes.
To combine the advantages of both free and limited mobility models a new
type of model was introduced in PV. The model was created according to an
observation that node density of ad hoc network systems will vary in time and
in space [26]. In the model two different characteristics were defined into one
single model. We used the city section model to depict areas where a great num-
ber of nodes are located, and the random way point model to depict sparser ar-
eas. Nodes were able to change their position between the areas with a certain
probability. PVII gives a more detailed specification of the model and the used
parameters.
A completely different mobility model, the pipe model, was defined in PII.
The pipe model was targeted on communication research using short range wire-
less communication, and it is even more statistical by nature than the existing
random mobility based models. As was earlier pointed out, the traditional way
of studying ad hoc networks has a tendency to always focus on modelling the
movement patterns of mobile nodes. However, a MP2P network with a very
limited communication range can not be expected to work, unless an adequate
number of nodes are located in a limited area at a given moment. When this
assumption is combined with the notion voiced in [26] that mobile nodes are typ-
ically packed in certain areas, it can be justified to change the focus of the study
from separate mobile nodes to defined observation areas.
Figure 6 depicts the pipe model. It focuses exactly on these relevant areas where
information can be exchanged due to high node density. A simulation space con-
sists of one or more pipes and a set of nodes. The core of the model is a pipe
31
through which the nodes can travel. When a node enters the pipe, it has the pos-
sibility to interact with other nodes. Outside the pipe there is no interaction, i.e. a
node can not obtain anything that would make a change in its state, and therefore
no computational resources are needed to process them. The main characteristics
of the pipe model are the possibilities for a chain reaction and pipes with differ-
ent parameters (size, capable of interaction or not, number and speeds of node
flows). The nodes are randomly selected to enter the pipe, and the implementa-
tion in PII made it possible to bias the selection in order to favour some subsets
of the nodes. This feature reflects the behaviour of different types of nodes, from
which some are more active in their movement than others.
The second set of mobility models in the scope of this thesis are models which
are based on pre-defined paths that the nodes must follow. The paths can be
synthetic or real traces, but even the synthetic ones are based on realistic mobility.
The movement of the nodes along the paths, however, can be defined based on
statistics. The nodes might change their speed or direction and they may need
to make choices between possible routes. The selection of the next target and the
pause times between two trips are adjustable as well.
This set of models provides the researcher with an accurate knowledge of
events within a simulated system. Even though the result gained with these types
of models should reflect the real world scenario with a greater reliability com-
pared to the statistical models, they do not suite all the cases because of four
main reasons [37]: 1) There is very limited amount of data available to generate
the traces. 2) The models that depict their environment with great detail are not
well suited for general use, because the achieved results reflect the particular en-
vironment. 3) Sensitivity analysis is not possible with fixed parameters. 4) These
models are hard or impossible to be defined mathematically, which might be an
important aspect in some type of analysis.
As noted above the models of this type do not usually have an exact al-
gorithm according to which the mobility is generated. Research in this divides
into two parts: how to obtain traces and how to make studies with them. Ad-
ditionally, at some times researchers mix the concept of the mobility model with
something else. In [40], for example, the authors introduce a "new mobility model
for VANET". According to the paper, the new model is, in fact, merely a method
where they use input from an external traffic simulator to a general purpose sim-
ulator (NS-2) that they have programmed for inter vehicular routing. The method
itself is usable, but it is not a mobility model. Actually, the trace-driven simula-
tions can be performed without any model just by using the recorded traces [27].
Compared to the previous example, the authors of [33] are much more care-
ful with their words. They present a "method to generate a mobility scenario",
while the method could be called a mobility model. The proposed method uses
real world observations of node mobility as a starting point and generates imag-
inary routes that are sound in the sense that a real world node could travel along
32
them. In their paper Kim et al. [27] use a similar method. In order to define a new
mobility model they emphasize the importance of understanding the mobility as
a process, and the need of being able to obtain detailed mobility data about real
users and carefully characterize their mobility. When analyzing the mobility data,
they searched regularities on characteristics like pause times, speed distribution
and direction of movement. According to the result, the particular source ma-
terial revealed mathematical regularities on pause times and speed distribution,
but only context-related regularities on direction of movement.
A model based on student movement in a campus area was presented in
[5]. As with the model in [27], the authors were also able to provide an analytical
description for their model, although much more simple that was focusing only
on probable directions. In addition to analyze fine-grained data collected from
individual people to create a mobility model, Yoon et al. [63] introduce a method
for refining coarse-grained data into a mobility model. They designed a model
which combined geographical data (maps) with user information from wireless
access points. The topology information from the maps was used to form user
paths from the wireless access point logs using heuristics based on distance. The
authors claimed that their model can be used to predict user densities on certain
areas and probabilities for a user to follow certain routes, although the results did
not provide 100 per cent accuracy. A pure trace-based model was introduced in
[35], which also used a campus area as their environment. The authors criticized
synthetic mobility models for not reflecting the real world characteristics. How-
ever, at the same time they were not able to see the down side of a "too realistic"
model: there was no formal definition provided, the model setting was tightly re-
lated to the particular environment, and therefore, the environment parameters
were not adjustable.
The location based mobility model used in PVI and PVIII is not similar than
the others in a sense that it is merely a method for selecting the next target loca-
tion. The method itself does not try to model the environment, but it only focuses
on node behaviour. It can and should be used in conjunction with other mod-
els that generate the actual traces and important locations. In the location-based
mobility model, each node has an individual location pattern containing several
positions that the node visits frequently. The movement of the nodes takes place
in a graph representing a road map, where nodes roam between their own reg-
ular visits or randomly selected locations which consists of a few locations rep-
resenting shops and other favoured visiting areas which are preferred in a nodes
location pattern.
3.2 Communication
When ad hoc and mobile peer-to-peer networking become attractive to the re-
search community, most of the resources were targeted on overcoming the prob-
lems caused by mobility. Today the situation is different: mobility has been un-
33
The most obvious requisite for inter node communication is that they can estab-
lish a communication link. In the context of this thesis, the communication link is
restricted to a short-range local communication. With this restriction in mind, we
can say that node encounters are the only relevant events when studying com-
munication in MP2P systems. It is especially important to analyze the frequency
and duration of contacts between the carriers of communication devices [9].
This problem, however, has been studied only a little. PV is among the first
published studies to describe the mathematical characteristics of node encoun-
ters. The paper discusses simulation settings that are aimed at finding regulari-
ties in node encounters. The dual mobility model was used to simulate mobility
of the nodes, and the result was that successive encounters of two mobile nodes
resemble gamma distribution. In [9] the authors have real data about human
encounters. Their analysis revealed that power law distribution was the best ap-
proximation of the observed behaviour. They noted, however, two limitations
in applying the formal method: a granularity of observations (i.e. how often the
state of the system was recorded) limits the accuracy and long encounter intervals
(more than one day) differed considerably from the observations. Another issue
regarding the power law distribution is that the distribution function gets high
output values with small values of the variable. Therefore, if such a distribution is
used, one must be very careful when setting a domain for the observation. Even
though our simulations in PV resulted in contact periods to follow the gamma
distribution, the presented mathematical model was not tied to any particular
distribution. In fact, the distribution of contacts characterizes the environment
and therefore should be selected based on the best knowledge.
Neither of the above mentioned studies included duration of the encoun-
ters, but only a number and intervals. However, the authors of [12] remarked
that these values do not say anything about how long each contact will last, and
therefore they do not provide enough information about the possibility of ac-
34
tual data transfer. In addition, they reminded that if there is human behaviour
involved, the dependency of contact should be taken into consideration. This
type of social relations were used for an interaction study in [37]. In the paper
the authors talk about the community based mobility model that takes human
behaviour into account. They defined an interaction matrix to describe social
relations of the community including the notion that social networks are time de-
pendent, and can vary depending on the time of day or day of the week. With
the help of the interaction matrix they generated a connection matrix, which was
used to define whether the two nodes had connectivity or not. As a conclusion
they noted that the introduced model reproduced contact information that has
characteristics similar to real world observations, but they did not provide a for-
mal definition.
Mobility models can be used to simulate node encounters which, in turn, are
basic elements for studying data spread. Information diffusion among mobile
nodes within a mobile community was a starting point for a set of studies that
culminate in this doctoral thesis. It has been studied earlier by, for example, Arai
et al. [2] and Khelil et al. [26]. Both of these studies used some free random mobil-
ity model to generate a communication event between the network nodes. When
studying information diffusion in a system, in addition to the model for mobil-
ity of the network nodes, there is also a requirement for modelling the informa-
tion exchange process, i.e. the way a piece of information is transferred between
nodes. One possible, and often used, model is an epidemic diffusion model. Al-
though, the epidemic model is widely used in various areas of research, its main
problem is its tight relation with random mobility models: it is assumed that an
object which carries the ’infection’ will meet other objects in a random pattern.
When using this kind of model, the number of infected users as a function of
time will have a general S-shaped form of the logistic function, like the ones from
Arai et al. and Khelil et al. The same type of results have also been presented in
a more recent paper [44], which discusses the spreading of computer viruses in a
mobile ad hoc environment. The nature of the results are based on the fact that
with only random mobility it requires ’critical mass on infected objects’ for the in-
formation to spread quickly. Therefore at the beginning of the diffusion process
the information spread is rather slow until it explodes.
As an opposite to this traditional model, we claimed in PII that informa-
tion can be expected to spread faster immediately when it is released into an area
where there is a high density of possible new carriers of the information. The re-
sults were achieved while using the pipe mobility model discussed in the Section
3.1.1. The results of similar nature was presented in [41]. In the paper a worm
epidemic was modelled in a VANET environment. Many of the characteristics
of the particular study were similar to our information diffusion study with the
pipe model. The VANET environment that utilizes a road network provides a
similar communication pattern between the nodes than our pipe model. In addi-
35
tion, the worm epidemic was released into a population dense enough. Therefore
the spread of data could start from the beginning with full power.
In addition to mobile units only, Arai et al. [2] also included stationary
information sources into their study. These source points were fixed locations
where the spreading information originated. Similar extensions were also used
in a study by Papadopouli et al. [43]. Although Papadopuli’s study was not about
information diffusion in a MP2P network, they also presented the idea of the need
for stationary components to complement mobility models. In other words, one
must take in account all the elements that might affect the result whether they are
mobile or not. This and an assumption that not all of the mobile nodes are able to
participate in an information diffusion process were implemented in PII where
we provided a mathematical expression to the results.
When observing the simulation results more specifically we noticed that the
curve of the simulation results resembled function
n(t) = P − Pe−kt , (1)
but it was not as smooth, and the parameters P and k could not be fitted to suite
that function. However, from the output of the simulator one can separate three
different factors: 1) The nodes that do not have technology for MP2P, and there-
fore are able to receive information only from the information source. 2) The
nodes that have received their data from another node inside an exchange pipe.
3) The nodes which have the required technology, but received the information
from the information source. The nodes who form the third factor act as a car-
rier and therefore they are the ones who start the explosive information diffusion
process at the exchange pipe. By drawing the curves of these three factors sepa-
rately we can see that each factor draws a curve that is in the form of (1) and it
is possible to fit the parameters P and k to follow the simulation results. It can
thus be argued that information diffusion in a short range mobile peer-to-peer
network can be modelled with function N(t) as a sum of (1) with three different
parameters:
3 3
N (t) = ∑ ni (t) = ∑ Pi − Pi e−ki t , (2)
i =1 i =1
in which Pi is the number of nodes in the group i and k i are constants which
values depends on environmental setting. The presented equation (2) provides a
tool for predicting the data spread within a mobile community without running
actual simulations. The only requirements are numerical values for the constants
of the equation.
The model for information diffusion presented above is valid only when
the node mobility of a system can be modelled using the pipe model. The pipe
model, on the other hand, was designed to reflect city centers and other highly
crowded places. In PV we introduced a concept of Mobile Encounter Networks,
MEN, which is more general and can be used for environments with another type
of a mobility pattern. The formal definition of the concept provides us with a tool
for taking advantage of node encounters in order to model information diffusion.
The definition is presented below:
36
to indicate the possibility of nodes communicating with each other so that value
1 of the discrete and binary-valued function
Therefore, we can say that information diffusion taking place at time t will cause
a change in the vector B, so that
n
bi ( t + 1 ) = bi ( t ) f (t)i,j b j(t) , ∀ i ∈ [1, 2, . . . n] . (7)
j =1
then
0, b̃i = 0,
bi ( t + 1 ) = ∀ i ∈ [1, 2, . . . n] . (9)
1, b̃i > 1,
A set of functions (5), which are elements of matrix F, characterizes the in-
formation diffusion process with non zero values denoting a possibility to data
exchange. In order to give a formulation to the data exchange possibilities, let us
define
tk,i,j (10)
to represent the time when node i encounters node j for the kth time. Thus we can
say that function (5) is defined as follows
1, t ∈ T = t1,i,j , t2,i,j , . . . , tm,i,j , . . .
f i,j (t) = , (11)
0, t ∈ T
In the thesis there were several contributions which considered information dif-
fusion from one source to a population of mobile nodes, information collection
from several sources to one target using mobile nodes as carriers, and informa-
tion routing from one source to one destination using (several different) mobile
carriers.
Let us consider the information diffusion function (2). The shape of the
function is similar compared to heat conductivity in physics. The characteristics
of these two phenomena share common features. In both cases there are two sets
that are originally in different states and the system attempts to stabilize itself by
letting particles move from a higher concentration to a lower one. The soundness
of the results is further verified when compared to the results in [41]. Their simu-
lation settings had similar characteristics and results were also analogous. Their
analysis concluded that the data spread was linear when it was at its maximum.
Let us now take the first order derivative of (2). As a result we will achieve a
monotonic decreasing function:
dN (t) 3
= ∑ Pi k i e−ki t . (13)
dt i =1
dN (t) 3
lim = ∑ Pi k i , (14)
t →0+ dt i =1
we will see that it results in a constant value, i.e. the Function (2) has a maximum
of linear growth which depends on the population and other environmental con-
ditions.
The formal description of MEN does not provide any values or functions,
but only an instruction in how to define the encounter matrix (4). By knowing
38
the distribution of node encounters within a system, the element functions of the
encounter matrix should be set so that the conditions of (11) and (12) are fulfilled.
If there is not enough information available of the MP2P system, the only
solution to study communication is to simulate mobility of the nodes. Both the
statistical and deterministic models have their pros and cons. On the one hand,
deterministic models provide knowledge of the behaviour of the nodes. On the
other hand, they are suited only for that particular or very similar environment.
When using the statistical models, the best they can provide is an approximation
of the results. However, the resulted approximation might be valid for a number
of environments.
The third part of the major contribution of the thesis was the idea of letting
smart messages route themselves. This concept provides dual layer mobility: At
the lower layer the message carrier is mobile, and the message can take advantage
of the movement of its host. At the higher layer, the smart message is also mobile.
It will try to search for a new host, if the current one can not offer transportation
service to a required location. As the method of describing information diffusion
in MEN, the concept of self routing smart messages is also independent of the
details. For example, any of the routing methods presented in Section 2.2.2 can
be used.
Further studies should be made in order to be able to define accurate metrics
for the presented models and parameter dependency on real world phenomena.
39
REFERENCES
[4] C. Bettstetter. Smooth is better than sharp: a random mobility model for
simulation of wireless networks. In Proceedings of the 4th ACM international
workshop on Modeling, analysis and simulation of wireless and mobile systems
(MSWIM 01), pages 19–27, 2001.
[8] T. Camp, J. Boleng, and V. Davies. A survey of mobility models for ad hoc
network research. Wireless communication & Mobile computing: Special issue
on Mobile Ad Hoc Networking: Research, Trends and Applications, 2(5):483–502,
2002.
[10] I. D. Chakeres and J. P. Macker. Mobile ad hoc networking and the IETF.
ACM SIGMOBILE Mobile Computing and Communications Review, 10(1):58–60,
2006.
41
[12] C. Chen and Z. Chen. Evaluating contacts for routing in highly partitioned
mobile networks. In Proceedings of the 1st international MobiSys workshop on
Mobile opportunistic networking (MobiOpp 07), pages 17–24, 2007.
[13] Z.D. Chen, H. Kung, and D. Vlah. Ad hoc relay wireless networks over
moving vehicles on highways. In Proceedings of the 2nd ACM international
symposium on Mobile ad hoc networking and computing (MobiHoc 01), pages 247
– 250, 2001.
[16] G. Ding and B. Bhargava. Peer-to-peer file sharing over mobile ad hoc net-
works. In Proceedings of the second IEEE Annual Conference on pervasive com-
puting and communication workshops, pages 104 – 108, 2004.
[21] P. Hui and J. Crowcroft. How small labels create big improvements. In
Proceedings of the Fifth Annual IEEE International Conference on Pervasive Com-
puting and Communications Workshops (PerCom Workshops 07), pages 65–70,
2007.
[25] A. E. Kamal and J. N. Al-Karaki. A new realistic mobility model for mo-
bile ad hoc networks. In Proceedings of the IEEE International Conference on
Communications (ICC 07), pages 3370–3375, 2007.
[26] A. Khelil, C. Becker, J. Tian, and K. Rothermel. An epidemic model for infor-
mation diffusion in MANETs. In Proceedings of the 5th ACM international work-
shop on Modeling analysis and simulation of wireless and mobile systems (MSWiM
02), pages 54–60, 2002.
[27] M. Kim, D. Kotz, and S. Kim. Extracting a mobility model from real user
traces. In Proceedings of the 25th IEEE International Conference on Computer
Communications (INFOCOM 2006), pages 1–13, 2006.
[28] S. Krco, D. Cleary, and D. Parker. P2P mobile sensor networks. In Proceedings
of 38th Annual Hawaii International Conference on System Sciences (HICSS 05),
pages 324c – 324c, 2005.
[32] J. P. Macker and M. S. Corson. Mobile ad hoc networking and the IETF. ACM
SIGMOBILE Mobile Computing and Communications Review, 2(1):9–14, 1998.
[35] M. McNett and G. M. Voelker. Access and mobility of wireless PDA users.
Mobile Computing and Communications Review, 9(2):40 – 55, 2005.
[37] M. Musolesi and C. Mascolo. A community based mobility model for ad hoc
network research. In Proceedings of the 2nd international workshop on Multi-hop
ad hoc networks: from theory to reality (REALMAN 06), pages 31–38, 2006.
[48] E.M. Royer, P.M. Melliar-Smith, and L.E. Moser. An analysis of the opti-
mum node density for ad hoc mobile networks. In Proceedings of the IEEE
International Conference on Communications (ICC 2001), pages 857–861, 2001.
[50] J.A Serri. Reference architectures and management model for ad hoc sen-
sor networks. In Proceedings of the First Annual IEEE Communications Society
Conference on Sensor and Ad Hoc Communications and Networks (IEEE SECON),
pages 592 – 600, 2004.
[51] R.C. Shah, S. Roy, S. Jain, and W. Brunette. Data MULEs: modeling a three-
tier architecture for sparse sensor networks. In Proceedings of the First IEEE
International Workshop on Sensor Network Protocols and Applications, pages 30
– 41, 2003.
[55] A. Tiwari, F.L. Lewis, and S.S. Ge. Wireless sensor network for machine
condition based maintenance. In Proceedings of the 8th Control, Automation,
Robotics and Vision Conference (ICARCV 2004), pages 461–467, 2004.
[62] J. Yoon, M. Liu, and B. Noble. Sound mobility models. In Proceedings of the 9th
annual international conference on Mobile computing and networking (MobiCom
03), pages 205–216, 2003.
[63] J. Yoon, B. D. Noble, M. Liu, and M. Kim. Building realistic mobility models
from coarse-grained traces. In Proceedings of the 4th international conference on
Mobile systems, applications and services (MobiSys 06), pages 177–190, 2006.
[66] Q. Zheng, X. Hong, and S. Ray. Recent advances in mobility modeling for
mobile ad hoc network research. In Proceedings of the 42nd annual Southeast
regional conference, pages 70–75, 2004.
[67] P. Zhou, T. Nadeem, P. Kang, C. Borcea, and L. Iftode. EZCab: A cab booking
application using short-range wireless communication. In Proceedings of the
third conference on pervasive computing and communications (PerCoM 05), pages
27 – 38, 2005.