Understanding Mobility in A Social Petri Dish



Michael Szell1, Roberta Sinatra2,8, Giovanni Petri3,9,10, Stefan Thurner1,6,7 & Vito Latora4,5,8

1 Section for Science of Complex Systems, Medical University of Vienna, Spitalgasse 23, 1090 Vienna, Austria
2 Center for Complex Network Research and Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA
3 Institute for Scientific Interchange (ISI), Via Alassio 11/c, 10126 Torino, Italy
4 School of Mathematical Sciences, Queen Mary, University of London, London E1 4NS, United Kingdom
5 Dipartimento di Fisica e Astronomia, Università di Catania and INFN, Via S. Sofia, 64, 95123 Catania, Italy
6 Santa Fe Institute, Santa Fe, NM 87501, USA
7 IIASA, Schlossplatz 1, 2361 Laxenburg, Austria
8 Laboratorio sui Sistemi Complessi, Scuola Superiore di Catania, Via San Nullo 5/i, 95123 Catania, Italy
9 Centre for Transport Studies, Department of Civil and Environmental Engineering, Imperial College London, London SW7 2AZ, UK
10 Complexity and Networks group, Imperial College London, London SW7 2AZ, UK

SUBJECT AREAS: APPLIED PHYSICS; STATISTICAL PHYSICS, THERMODYNAMICS AND NONLINEAR DYNAMICS; STATISTICS; MODELLING AND THEORY

Received 9 March 2012; Accepted 21 May 2012; Published 14 June 2012

Correspondence and requests for materials should be addressed to S.T. (stefan.thurner@meduniwien.ac.at)

Despite the recent availability of large data sets on human movements, a full understanding of the rules governing motion within social systems is still missing, due to incomplete information on the socio-economic factors and to often limited spatio-temporal resolutions. Here we study an entire society of individuals, the players of an online-game, with complete information on their movements in a network-shaped universe and on their social and economic interactions. Such a “socio-economic laboratory” allows us to unveil the intricate interplay of spatial constraints, social and economic factors, and patterns of mobility. We find that the motion of individuals is not only constrained by physical distances, but also strongly shaped by the presence of socio-economic areas. These regions can be recovered perfectly by community detection methods solely based on the measured human dynamics. Moreover, we uncover that long-term memory in the time-order of visited locations is the essential ingredient for modeling the trajectories.

Understanding the statistical patterns of human mobility, predicting trajectories and uncovering the mechanisms behind human movements [1] is a considerable challenge with important practical applications to traffic management [2,3], planning of urban spaces [4,5], epidemics [6–9], information spreading [10,11], and geo-marketing [12,13].
In recent years, advanced digital technologies have provided huge amounts of data on human activities, making it possible to extract information on human movements. For instance, observations of banknote circulation [14,15], mobile phone records [16], online location-based social networks [17,18], GPS location data of vehicles [19], or radio frequency identification traces [1,5,20] have all been used as proxies for human movements. These studies have provided valuable insights into several aspects of human mobility, uncovering distinct features of human travel behaviour such as scaling laws [14,21], predictability of trajectories [22], and impact of motion on disease spreading [7–9,23]. However, a comparative analysis of the different works makes clear that a “unified theory” of human mobility is still missing, since results, even on some very basic features of the motion, often appear to contradict each other [1]. One example is the measured distribution of human trip lengths in various types of transportation: some studies agree that mobility is generally characterized by fat-tailed distributions of trip lengths [14,21], while others report exponential or binomial forms [1,5,19]. The discrepancies arise from the different mobility data sets used, where mobility is indirectly inferred from some specific human activity in a particular context. For instance, mobile phone records typically provide location information only when a person uses the phone [21], while radio frequency identification traces like the ones of Oyster cards in the London subway [5] only log movements based on public transportation systems. Analyses of these data sets can then result in a possibly biased view of the underlying mobility processes. Furthermore, most of the analyzed data sets have poor information on how socio-economic factors influence human mobility patterns. More generally, the lack of an all-encompassing record set with positional raw data, including complete information on the socio-economic context and on the behaviour of all members of a human society, has so far limited the possibilities for a comprehensive exploration of human mobility.

Here, we address the issue of mobility from a novel point of view by analyzing, with unprecedented precision, the movements of a large number of individuals, the players of a self-developed massive multiplayer online game (MMOG). Such online platforms provide a fascinating new way of observing hundreds of thousands of interacting individuals who are simultaneously engaged in social and economic activities. The potential of online worlds as large-scale “socio-economic laboratories” has been demonstrated in a number of previous studies [25–28].

SCIENTIFIC REPORTS | 2 : 457 | DOI: 10.1038/srep00457 1


www.nature.com/scientificreports

For the MMOG at hand [29], we have access to practically all actions [30], including movements, accumulated over several years. This MMOG can therefore be considered as a “socio-economic petri dish” to study mobility in a completely controlled way. We can in fact observe the long-time evolution of a social system at the scale of an entire human society, having a perfect knowledge of all the spatio-temporal and socio-economic details. In contrast to traditional studies in social science which are typically biased by well-known “interviewer-effects”, in MMOGs the socio-economic measurements are objective and unobtrusive, since subjects are not consciously aware of being observed.
Using positional data of the players in the game universe, in combination with other socio-economic information from the game, we uncover various fundamental features of mobility, and we provide a complete description of the mechanisms causing the observed anomalous diffusion. Our work has two main results. First, we find the emergence of different spatial scales, due to the strong tendency of the players to limit their economic activities to some specific areas over long time periods and to avoid crossing the borders between different areas. Making use of this observation, we propose an efficient method to identify socio-economic regions by means of community detection algorithms based solely on the measured movement dynamics. Our second result unveils the driving mechanism behind the movement patterns of players: locations are visited in a specific order, leading to strong long-term memory effects which are essential to understand and reproduce the observed trajectories. Finally, we provide large-scale evidence that neglecting either of these spatial or temporal constraints may obstruct the possibility of understanding the processes behind human mobility.

Pardus is a massive multiplayer online game running since 2004, with a worldwide player base of more than 350,000 individuals. It is an open-ended game whose players live in a virtual, futuristic universe and interact with each other in a multitude of ways. The topology of the universe can be represented as a network with 400 nodes, called sectors, embedded in a two-dimensional space, the so-called universe map shown in Fig. 1. Each sector is like a city where players can have social relations (establish new friendships, make enemies and wage wars), and entertain economic activities (trade and production of commodities). Typically, sectors adjacent on the universe map, as well as a few far-apart sectors, are interconnected by links which allow players to move from sector to sector. At any point in time, each sector is usually attended by a large number of players. The network is sparse and, similarly to other spatial networks, is not a small world. It has a characteristic path length $L = 11.89$ and a diameter $d_{\max} = 27$, which means that, on average, players have to move through a non-negligible number of sectors to traverse the universe. See Supplementary Section S3 and Supplementary Table I for a detailed characterization of the universe network structure.

Figure 1 | The universe map of the massive multiplayer online game Pardus. The universe of Pardus can be represented as a network [24] with $N = 400$ nodes, called sectors (playing the role of cities), and $K = 1160$ links. Sectors are organized into 20 different regions, called clusters, shown in the figure as different colour-shaded areas. There is no explicit set of goals in the game. Players are free to interact in a number of ways to e.g. increase their virtual wealth or status. Players move between sectors to interact with other players, e.g. to trade, attack, wage war, or to explore the virtual world.

The sectors have been originally organized by the developers of the game into 20 different clusters, which are perceived by the players as different political or socio-economic regions such as countries. For example, a player who is a member of a political faction in the game is provided some game-relevant protection in all clusters which are controlled by the faction, and has the opportunity of social promotion when accomplishing certain tasks within these clusters. Each cluster is shown in Fig. 1 with a different background colour. All clusters contain about 20 sectors each, with the exception of the central cluster, consisting of just one sector, and its surrounding three clusters having only 6–7 sectors. Sectors belonging to the same cluster are geographically close on the map, meaning that the distance between any two sectors in the same cluster is small, with an average distance around 3. Players typically have a “home cluster” where they focus their socio-economic activities over long time periods. Occasionally, they also move to sectors belonging to other clusters in order to explore the universe, to relocate their home (migrate), or during extreme game events such as wars.

In Pardus, players are free to pursue whichever role they like to take. Many of them focus on expanding their social relations or political influence, some play the role of “scientists” exploring the universe, while others choose their main goal in trade and optimizing the amount of virtual money earned [25]. The large variety of complex socio-economic behaviours emerging in this online society results in high heterogeneity in the mobility patterns, such as observed in real human motion. However, differently from other empirical studies on human movements, mobility in Pardus can be investigated in a controlled way, since complete information on actions of players is available [25,26]. In this article we consider a data set consisting of movements in the network universe of all players who were active over a period of 1,000 days, as well as of socio-economic information about their environment. This opens the possibility of investigating motion in relation to other social and economic factors. Note that we do not have to address the common issues of relying on incomplete data, on data that are only a proxy of mobility, or on data that are aggregates of different types of transportation [9]. See Supplementary Section S2 for more details on the data set.

Results

Basic features of the motion. The position of each player in the universe, namely the ID number of the sector where the player is currently situated, is logged once a day. In this way the motion of each player becomes a time series of 1,000 sector positions. A jump occurs when a player’s sector position changes from one day to the following. The associated length $d$ of a jump is measured in terms of graph distance, an integer value between 1 and $d_{\max} = 27$. The probability distribution of jump distances, computed for all players over the whole observation period, is reported in Figure 2 (a). For $d \le 15$, the distribution is well-fitted by an exponential:

$$P(d) \sim e^{-d/\lambda}, \tag{1}$$


with a characteristic jump length $\lambda \approx 3$. The existence of a typical travel distance, as also recently found in other mobility data [5,19], is related to the use of a single transportation mode in Pardus [31]. This makes it possible to disentangle the intrinsic heterogeneity of the players from the effects due to the presence of different means of transportation [9], which might be the cause of the scale-free distributions found in mobile phone or other mobility data sets [14,16]. It has in fact been suggested that power laws in distance distributions of movement data may emerge from the coexistence of different scales [1,32].

In some cases, players stay in the same sector for a number of consecutive days. For instance, 11 of the 1458 considered players, although being active in the game, never jump within the entire observation period. On average, a player does not change sector in approximately 75% of the days. To better characterize the motion, we computed the waiting times $\Delta t$ (measured in terms of number of days) between all pairs of consecutive jumps, over all players. The distribution of these waiting times, shown in Fig. 2 (b), follows a power-law distribution:

$$P(\Delta t) \sim \Delta t^{-\beta} \tag{2}$$

with an exponent $\beta \approx 2.2$, in agreement with other recent measurements on human dynamics [33]. In addition, we found that the average waiting times of individual players are distributed as a power-law (see Supplementary Fig. 2). This implies a strong heterogeneity in the motion of different players, which is related to the heterogeneity in their general activity (see Supplementary Section S1 and Supplementary Fig. 1).

Figure 2 | Distribution of jump distances and of waiting times. To each player a time series consisting of the sector positions over 1000 days is associated. A jump is said to occur when the sector position in the time series changes from one day to the following. The length $d$ of a jump is measured in terms of graph distance and can take an integer value between 1 and $d_{\max} = 27$, the diameter of the network. (a) The probability distribution of jump distances is reported in a semi-log plot. For $d \le 15$, the distribution follows an exponential $P(d) \sim e^{-d/\lambda}$ with a characteristic length $\lambda \approx 3$. Players can also remain in the same sector for more days, without moving to other sectors. We define as waiting time $\Delta t$ the number of consecutive days a player spends in only one sector. (b) We show the probability distribution of waiting times $\Delta t$ in a log-log plot, which is well fitted by a power-law $P(\Delta t) \sim \Delta t^{-\beta}$, with $\beta \approx 2.2$.

Mobility reveals socio-economic clusters. Mobility patterns are influenced by the presence of the socio-economic regions in the network, highlighted in colours in Fig. 1. The typical situation is illustrated in Fig. 3 (a), with jumps within the same cluster being preferred to jumps between sectors in different clusters. In order to quantify this effect, we report in Fig. 3 (b), blue circles, the observed number of jumps of length $d$ within the same cluster, divided by the total number of jumps of length $d$. This ratio is a decreasing function of the distance $d$, and reaches zero at $d = 12$, since no two sectors at that distance belong to the same cluster. As a null model we report the fraction of sector pairs at distance $d$ which belong to the same cluster, see red squares in the same figure. The significant discrepancy between the two curves indicates that players indeed tend to avoid crossing the borders between clusters. For example, a jump of length $d = 8$ from one sector to another sector in the same cluster is expected only in 3% of the cases, while it is observed in about 20% of the cases.

Now, the propensity of a player to spend long time periods within the same cluster might be simply related to the topology of the network, as in the case of random walkers whose motions are constrained on graphs with strong community structures [34]. Nodes belonging to the same cluster are in fact either directly connected or are at short distance from one another. This proximity is reflected in the block-diagonal structure of the adjacency matrix A and of the distance matrix D, respectively shown in Fig. 4 (a) and (b). We have therefore checked whether the presence of the socio-economic clusters originally introduced by the developers of the game can be derived solely from the structure of the network. For this reason we adopted standard community detection methods based on the adjacency and on the distance matrix [35,36]. The results, reported respectively in Fig. 4 (d) and (e), show that detected communities deviate significantly from the clusters, implying that in our online world the socio-economic regions cannot be recovered merely from topological features. In

Figure 3 | Influence of socio-economic clusters on mobility. (a) Sketch of jump patterns from a sector $i$ to sectors within the same cluster, $j$ and $l$, and to sectors in a different cluster, $j'$ and $l'$. Although sectors $j'$ and $l'$ have the same graph distance from sector $i$ as sectors $j$ and $l$ respectively, transitions across the cluster border have smaller probabilities. (b) Quantitative evidence of the tendency of players to avoid crossing borders. Red squares show the null model, i.e. the fraction of all pairs of sectors at a given distance $d$ being in the same cluster. Blue circles show the fraction of measured jumps leading into the same cluster, per distance. Coincidence of the two curves would indicate that clusters have no effect on mobility. Clearly this is not the case: there is a strong tendency of players to avoid crossing the borders between clusters.
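The comparison shown in panel (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six-sector path graph, the cluster labels, the two trajectories and all function names are made up for the example and are not taken from the Pardus data or from the authors' code.

```python
from collections import Counter, deque


def graph_distances(adjacency):
    """All-pairs shortest-path lengths, computed by breadth-first search."""
    dist = {}
    for source in adjacency:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen[neighbour] = seen[node] + 1
                    queue.append(neighbour)
        dist[source] = seen
    return dist


def jumps(series):
    """Day-to-day position changes in one player's daily sector series."""
    return [(a, b) for a, b in zip(series, series[1:]) if a != b]


def same_cluster_fraction(pairs, dist, cluster):
    """Fraction of (i, j) pairs that stay inside one cluster, per graph distance."""
    total, inside = Counter(), Counter()
    for i, j in pairs:
        d = dist[i][j]
        total[d] += 1
        inside[d] += cluster[i] == cluster[j]
    return {d: inside[d] / total[d] for d in sorted(total)}


# Hypothetical toy universe: a path 0-1-2-3-4-5 split into two clusters.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
cluster = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
dist = graph_distances(adjacency)

# Null model (red squares): all ordered pairs of distinct sectors.
null_pairs = [(i, j) for i in adjacency for j in adjacency if i != j]
null = same_cluster_fraction(null_pairs, dist, cluster)

# Observed curve (blue circles): jumps pooled over made-up players.
trajectories = [[0, 0, 1, 2, 2, 0], [3, 4, 4, 5, 3, 2]]
observed_pairs = [pair for series in trajectories for pair in jumps(series)]
observed = same_cluster_fraction(observed_pairs, dist, cluster)

print("null:    ", null)
print("observed:", observed)
```

In this toy case the observed within-cluster fraction at distance 2 exceeds the null-model value, which is the kind of gap between the two curves that the caption describes.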


[Figure 4 panels: (a) Adjacency A, (b) Distance D, (c) Transition count M]

Figure 4 | Extracting communities from network topology and from mobility patterns. (a) The adjacency matrix A of the universe network, (b) the matrix D of shortest path distances, and (c) the matrix M of transition counts of player jumps. Each of the three matrices contains $400 \times 400$ entries, whose values are colour-coded. Sector IDs are ordered by cluster, resulting in the block-diagonal form of the three matrices. We have used modularity-optimization algorithms to extract community structures from the information encoded in the three matrices. Different node colours represent the different communities found, while the 20 different colour-shaded areas indicate the predefined socio-economic clusters as in Fig. 1. The displayed Fowlkes and Mallows index $F \in [0, 1]$ quantifies the overlap of the detected communities with the predefined clusters. The closer $F$ is to 1, the better the match, see Supplementary Section S4. (d) Although the information contained in the adjacency matrix A yields 18 communities, a number close to the real number of clusters, the communities extracted do not correspond to the underlying colour-shaded areas ($F = 0.68$). (e) Extracting communities from the distance matrix D results in only 6 different groups ($F = 0.49$). (f) The 23 communities detected using the transition count matrix M reproduce almost perfectly the real socio-economic clusters ($F = 0.96$), with only a few mismatched nodes detected as additional clusters. For more measures quantifying the match of communities, see Supplementary Table II.
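For readers unfamiliar with the index $F$ quoted in the caption, the following minimal sketch computes the Fowlkes and Mallows index in its standard pair-counting form. The paper's exact procedure is described in its Supplementary Section S4, which is not reproduced here, so the function name and the toy labelings below are assumptions made purely for illustration.

```python
from collections import Counter
from math import comb, sqrt


def fowlkes_mallows(labels_a, labels_b):
    """Fowlkes-Mallows index of two labelings of the same nodes.

    Pair-counting form: pairs grouped together in both partitions,
    divided by the geometric mean of the pairs grouped together in
    each partition separately. Identical partitions give 1.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same nodes")
    both = sum(comb(n, 2) for n in Counter(zip(labels_a, labels_b)).values())
    in_a = sum(comb(n, 2) for n in Counter(labels_a).values())
    in_b = sum(comb(n, 2) for n in Counter(labels_b).values())
    if in_a == 0 or in_b == 0:
        return 0.0  # a partition without co-grouped pairs: report 0 by convention
    return both / sqrt(in_a * in_b)


# Hypothetical example with six sectors and two predefined clusters.
clusters = ["A", "A", "A", "B", "B", "B"]
detected_same = [0, 0, 0, 1, 1, 1]    # same partition under different names
detected_split = [0, 0, 1, 2, 2, 2]   # one sector split off as its own community

f_same = fowlkes_mallows(clusters, detected_same)
f_split = fowlkes_mallows(clusters, detected_split)
print(f_same, round(f_split, 3))
```

The index depends only on which nodes are grouped together, not on the community names, which is why relabeled but otherwise identical partitions score 1.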

comparison we considered the player transition count matrix M, shown in Fig. 4 (c), which displays a similar block-diagonal structure as A and D, but with the qualitative difference that it contains dynamic information on the system. Figure 4 (f) shows that community detection methods applied to the transition count matrix M reveal almost perfectly all the socio-economic areas of the universe. This finding demonstrates that mobility patterns contain fundamental information on the socio-economic constraints present in a social system. Therefore, a community detection algorithm applied to raw mobility information, as the one proposed here, is able to extract the underlying socio-economic features, which are instead invisible to methods based solely on topology. For a detailed treatment of adopted community detection methods and measures see Supplementary Section S4, Supplementary Table II and Supplementary Figs. 4 and 5.

A long-term memory model. In order to characterize the diffusion of players over the network, we have computed the mean square displacement (MSD) of their positions, $\sigma^2(t)$, as a function of time. Results reported in Fig. 5 (a) indicate that, for long times, the MSD increases as a power-law:

$$\sigma^2(t) \sim t^{\nu} \tag{3}$$

with an exponent $\nu \approx 0.26$. This anomalous subdiffusive behaviour is not a simple effect of the topology of the Pardus universe. In fact, as shown in Fig. 5 (b), gray stars, the simulation of plain random walks on the same network produces a standard diffusion with an exponent $\nu \approx 1$ up to $t \approx 100$ days, and then a rapid saturation effect which is not present in the case of the human players.

Insights from the previous section suggest that the anomalous diffusion behaviour might be related to the tendency of players to avoid crossing borders. We have therefore considered a Markov model in which each walker moves from a current node $i$ to a node $j$ with a transition probability $p_{ij} = m_{ij} / \sum_l m_{il}$, where $m_{ij}$ is the number of jumps from sector $i$ to sector $j$, as expressed by the transition count matrix M of Fig. 4 (c). The probabilities $p_{ij}$ are the entries of the transition probability matrix P, which contains all the information on the day-to-day movement of real players, such as the preference to move within clusters, the length distribution of jumps, as well as the tendency to remain in the same sector. Despite this detailed amount of information used (the matrix P has 160,000 elements), the Markov model fails to reproduce the asymptotic behaviour of the MSD, see magenta diamonds in Fig. 5 (b). Since the model considers only the position of the individual at its current time to determine its position at the following time, deviations from empirical data appear presumably due to the presence of higher-order memory effects [37]. For this reason we have considered the recently proposed preferential return model [21] which incorporates a strong memory feature. The model is based on a reinforcement mechanism which takes into account the propensity of individuals to return to locations they visited frequently before. This mechanism is able to reproduce the observed tendency of individuals to spend most of their time in a small number of locations, a tendency which is also prevalent in the mobility behaviour of Pardus players (see Supplementary Fig. 3). However, the implementation of the preferential return model on the Pardus universe network is not able to capture the scaling patterns of the MSD, as shown in Fig. 5 (b). The reason is that in the model the probability for an individual to move to a given location does not depend on the current location, nor on the order of previously visited locations. Instead, we observe that in reality individuals tend to return with higher probability to sectors they have visited recently and with lower probability to


Figure 5 | Diffusion scaling in empirical data and simulated models. (a) The mean square displacement (MSD) of the positions of players follows a power relation $\sigma^2(t) \sim t^{\nu}$ with a subdiffusive exponent $\nu \approx 0.26$. The inset shows the average probability $P_{\mathrm{ret}}(\tau)$ for a player to return after $\tau$ jumps to a sector previously visited. The curve follows a power law $P_{\mathrm{ret}}(\tau) \sim \tau^{-\alpha}$ with an exponent of $\alpha \approx 1.3$ and an exponential cutoff. (b) For comparison, we report the MSD for various models of mobility. For random walkers and in the case of a Markov model with transition probability $p_{ij} = m_{ij} / \sum_l m_{il}$ we observe an initial diffusion with an exponent $\nu \approx 1$ and then a rapid saturation of $\sigma^2(t)$, due to the finite size of the network. A preferential return model also shows saturation and does not fit the empirically observed scaling exponent $\nu$. Conversely, a model with long-time memory (Time Order Memory) reproduces the exponent almost perfectly. Such a model makes use of the empirically observed $P_{\mathrm{ret}}(\tau)$, while the Markov model and the preferential return model over-emphasize preferences to locations visited long ago and do not recreate the empirical curve well. Curves are shifted vertically for visual clarity.
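The two quantities named in this caption, the Markov transition probabilities $p_{ij}$ and the MSD, can be sketched from daily sector series as below. This is a hedged illustration only: the six-sector ring, the two trajectories and the function names are invented for the example, and the squared displacement is taken as the squared graph distance between positions, following the definitions given in the text.

```python
from collections import defaultdict, deque


def bfs_distances(adjacency, source):
    """Graph distance from source to every reachable node."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen


def transition_probabilities(trajectories):
    """Row-normalised day-to-day transition counts, p_ij = m_ij / sum_l m_il."""
    counts = defaultdict(lambda: defaultdict(int))
    for series in trajectories:
        for i, j in zip(series, series[1:]):
            counts[i][j] += 1  # days without movement (i == j) are counted too
    return {i: {j: m / sum(row.values()) for j, m in row.items()}
            for i, row in counts.items()}


def mean_square_displacement(trajectories, dist, lag):
    """Average squared graph distance between positions `lag` days apart,
    taken over all start days and all trajectories."""
    squared = [dist[s[start]][s[start + lag]] ** 2
               for s in trajectories
               for start in range(len(s) - lag)]
    return sum(squared) / len(squared)


# Hypothetical toy data: a ring of six sectors and two daily trajectories.
adjacency = {k: [(k - 1) % 6, (k + 1) % 6] for k in range(6)}
dist = {k: bfs_distances(adjacency, k) for k in adjacency}
trajectories = [[0, 0, 1, 2, 2, 3], [3, 4, 4, 4, 5, 0]]

P = transition_probabilities(trajectories)
msd = {lag: mean_square_displacement(trajectories, dist, lag) for lag in (1, 2, 3)}
print(P[4])
print(msd)
```

On real data one would fit the slope of the MSD against the lag on log-log axes to estimate the exponent; with trajectories this short the numbers only illustrate the bookkeeping.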

sectors visited a long time before. Consequently a sector that has been visited many times but with the most recent visit dating back one year has a lower probability to be visited again than a sector that has been visited just a few times but with the last visit dating back only one week.

To highlight this mechanism we measured the return time distribution in the jump-time series (see Methods). In particular, we extracted the probability $P_{\mathrm{ret}}(\tau)$ for an individual to return again (for the first time) to the currently occupied sector after $\tau$ jumps. As shown in the inset of Fig. 5 (a), we found that the return time distribution reads

$$P_{\mathrm{ret}}(\tau) \sim \tau^{-\alpha} \tag{4}$$

with an exponent $\alpha \approx 1.3$. We used this information for constructing a model which takes into account the higher re-visiting probability of recently explored locations. In this way we can capture the long-term scaling properties of movements. Exactly these asymptotic properties are fundamentally relevant for issues of epidemics spreading or traffic management.

This “Time Order Memory” (TOM) model incorporates a power-law distribution of first return times, together with a power-law distribution of waiting times and an exponential distribution of jump distances, as those observed empirically in Fig. 2. We show below that these ingredients are sufficient to reproduce the subdiffusive behaviour reported in Fig. 5 (a). The model works as follows: an individual stands still in a given sector for a number of days drawn from the waiting time distribution, Eq. (2). Then, the individual jumps. There are two possibilities: (i) with a probability $v$ she returns to an already visited sector, (ii) with the probability $1 - v$ she jumps to a so far unexplored sector. In case (i), one of the previously visited sectors is chosen according to Eq. (4). In the exploration case (ii), the individual draws a distance $d$ from the distance distribution, Eq. (1), and jumps to a randomly selected, unexplored sector at that distance. The model has four parameters. The parameters $\lambda$, $\beta$ and $\alpha$ of equations (1), (2) and (4) respectively, are fixed by the data. Further, averaging over all jumps and players, the probability of returning to an already visited location is $v \approx 0.83$. Similarly to the measured data, the MSD of the TOM model, black squares in Fig. 5 (b), exhibits no saturation effects and displays an exponent $\nu_{\mathrm{TOM}} = 0.23 \pm 0.02$ (the error is calculated over an ensemble of realizations) in agreement with the exponent observed for the players.

Discussion

The flat slope of $\nu \approx 0.26$ and the lack of saturation of the MSD of the players over the whole observation period expose the significant level of subdiffusivity in the motions of individuals, consistent with previous findings [21,38–41]. However, the mere tendency of individuals to return to already visited locations is not sufficient to capture these subdiffusive properties of the MSD; it is fundamental to consider a mechanism that takes into account the temporal order of visited locations, as achieved by the TOM model. Moreover, the TOM model is realistic in the sense that, in contrast to Markov models, it takes into account the tendency of individuals to develop a preference for visiting certain locations. At the same time it allows for the possibility that a previously preferred location is no longer frequented. This view provides an alternative to recently suggested reinforcement mechanisms in preferential return models [21]. The possibility for individuals to “change home” is relevant when the model should be able to account for migration, which is an important feature in the long-time mobility behaviour of humans.

Finally, we discuss to which extent the findings from our “social petri dish” are valid also for human populations unrelated to the game. Previous analyses of human social behaviour in Pardus [25,26] have shown agreement with well-known sociological theories and with properties on comparable behavioural data. Examining the preference of players to move within socio-economic regions is of obvious importance for clearing up the role of political or socio-economic borders on the movement and migration of humans, where the presence of borders has a strong influence on mobility [15,42–44]. Online societies as the one of Pardus have the evident potential to serve as “socio-economic laboratories”, where the complete knowledge of activities, social relations, and positions of all individuals can significantly advance our understanding of large-scale human behaviour, in particular of mobility.

Methods

Data set. We focus on one of the three Pardus universes, Artemis. For this universe, we extract player mobility data from day 200 to day 1200 of its existence. We discard the first 200 days because social networks between players of Pardus have shown aging effects in the beginning of the universe, i.e. there seems to exist a transient phase in the development of the society, possibly affecting mobility, which we would like to avoid considering [25]. To make sure we only consider active players, we select all who exist in the game between the days 200 and 1200, yielding 1458 players active over a time-period of 1000 days. The sector IDs of these players, i.e. their positions on the universe network’s nodes, are logged every day at 05:35 GMT. Players typically log in


once a day and perform all their limited movements of the day within a few minutes, 20. Cattuto, C. et al. Dynamics of person-to-person interactions from distributed rfid
see Supplementary Section S1. The legal department of the Medical University of sensor networks. PloS one 5, e11596 (2010).
Vienna has attested the innocuousness of the used anonymized data. 21. Song, C., Koren, T., Wang, P. & Barabási, A. Modelling the scaling properties of
human mobility. Nature Physics 6, 818–823 (2010).
Transition count matrix and transition probability matrix. The entry $m_{ij}$ of the
transition count matrix $M$ is equal to the number of times a player's position was on
sector $i$ and then, on the following day, on sector $j$. This number is cumulated over all
players. The entry $p_{ij}$ of the transition probability matrix $P$ corresponds to the
probability that a player moves to sector $j$ given that on the previous day the player's
location was sector $i$. It reads $p_{ij} = m_{ij} / \sum_l m_{il}$, where $m_{ij}$ is the
number of observed player movements from sector $i$ to sector $j$, and the sum over $l$ runs
over all sectors of the universe. The matrix $P$ is a stochastic matrix, i.e. the entries of
each row sum to one.

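The row normalization that turns transition counts into transition probabilities can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy trajectories are hypothetical, and it assumes that a player staying in the same sector on consecutive days counts as a transition from $i$ to $i$.

```python
from collections import defaultdict


def transition_counts(trajectories):
    """Count day-to-day transitions i -> j, accumulated over all players.

    `trajectories` holds one list of daily sector IDs per player.
    Assumption: staying in place (i -> i) is counted like any other transition.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for sectors in trajectories:
        for i, j in zip(sectors, sectors[1:]):
            counts[i][j] += 1
    return counts


def transition_probabilities(counts):
    """Normalize each row of the count matrix so that it sums to one."""
    probs = {}
    for i, row in counts.items():
        total = sum(row.values())
        probs[i] = {j: m / total for j, m in row.items()}
    return probs


# Two hypothetical players observed on four consecutive days.
toy = [[5, 5, 32, 104], [5, 32, 32, 5]]
counts = transition_counts(toy)
probs = transition_probabilities(counts)
```

In this toy data, sector 5 is left three times (once to 5, twice to 32), so `probs[5][32]` is 2/3. A sector that is never observed as a starting point, such as 104 here, has no row at all; how such rows should be treated is left open in this sketch.
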
MSD and diffusion. The MSD is defined as $\sigma^2(t) = \langle (r(T + t) - r(T))^2 \rangle$,
where $r(T)$ and $r(T + t)$ are the sectors a player occupies at times $T$ and $T + t$
respectively, and where by $(r(T + t) - r(T))$ we denote the distance between the two sectors.
The average $\langle \cdot \rangle$ is performed over all windows of size $t$, with their left
boundaries going from $T = 0$ to $T = 1000 - t$, and over all the 1458 players in the data set.
If $\sigma^2$ has the form $\sigma^2(t) \sim t^{\nu}$ with an exponent $\nu < 1$, the diffusion
process is subdiffusive; in the case $\nu > 1$ it is super-diffusive. An exponent of $\nu = 1$
corresponds to classical Brownian motion [38,39].

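A small Python sketch of the windowed average behind the MSD may help. It rests on several assumptions: `distance` stands for any function returning the distance between two sectors (in the toy check it is simply the separation on a line, not the game's universe network), and the exponent is estimated by a least-squares fit in log-log space, a fitting method the text above does not specify.

```python
import math


def mean_squared_displacement(trajectories, distance, t):
    """Average squared distance between positions that lie t days apart.

    The average runs over all windows of size t, i.e. over left boundaries
    T = 0 .. len(sectors) - 1 - t, and over all players.
    """
    total, windows = 0.0, 0
    for sectors in trajectories:
        for T in range(len(sectors) - t):
            d = distance(sectors[T], sectors[T + t])
            total += d * d
            windows += 1
    if windows == 0:
        raise ValueError("no window of size t fits into the trajectories")
    return total / windows


def fit_exponent(ts, msd_values):
    """Least-squares slope of log(MSD) versus log(t) (assumed fitting method)."""
    xs = [math.log(t) for t in ts]
    ys = [math.log(v) for v in msd_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def line_distance(a, b):
    """Hypothetical geometry for the toy check: sectors arranged on a line."""
    return abs(a - b)


# Toy check: one walker that moves one step per day in the same direction.
walk = [list(range(10))]
ts = [1, 2, 3, 4]
msd = [mean_squared_displacement(walk, line_distance, t) for t in ts]
nu = fit_exponent(ts, msd)
```

For this ballistic walker the MSD grows as $t^2$, so the fitted exponent is close to 2, i.e. super-diffusive in the terminology above. Replacing `line_distance` with a shortest-path distance on the sector network would be the natural next step, but that is outside this sketch.
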
Jump-time and first return time distribution. We transform the time series of daily
sector IDs occupied by the players from real-time to jump-time, in order to be able to
compare time series of different length and to focus on the movements between
sectors. As an example of this conversion, the time series [5, 5, 5, 32, 32, 104, 5,
5, 104, 104, 104, 32, 337, 337, 32, …] becomes in jump-time [5, 32, 104, 5, 104, 32, 337,
32, …]. We denote jump-time by the Greek letter $\tau$, that is, at jump-time $\tau$ a player has
performed exactly $\tau$ jumps. We use $\tau$ in the computation of the first return time
distribution. In the hypothetical time series of sectors [5, 32, 104, 5, 104, 32, 337, 32] a
first return to a sector lying $\tau = 1$ jumps back happens 2 times (104, 5, 104 and 32, 337,
32), for $\tau = 2$ this happens once (5, 32, 104, 5), and for $\tau = 3$ also once (32, 104, 5,
104, 32). This gives $P(1) = 0.5$ and $P(2) = P(3) = 0.25$, where the sum over all $P(\tau)$ is
equal to 1.

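The jump-time conversion and the first return counts from the example can be reproduced with a short Python sketch. The function names are hypothetical. The convention for tau is inferred from the worked example: tau counts the sectors visited between two consecutive visits to the same sector, so that the pattern 104, 5, 104 counts as tau = 1.

```python
from collections import Counter
from itertools import groupby


def to_jump_time(sectors):
    """Collapse runs of identical daily sector IDs into single entries."""
    return [sector for sector, _ in groupby(sectors)]


def first_return_distribution(jumps):
    """Relative frequency of first returns as a function of tau.

    Assumed convention (inferred from the worked example): tau is the number
    of sectors visited between two consecutive visits to the same sector.
    """
    last_seen = {}
    counts = Counter()
    for position, sector in enumerate(jumps):
        if sector in last_seen:
            counts[position - last_seen[sector] - 1] += 1
        last_seen[sector] = position
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tau: c / total for tau, c in sorted(counts.items())}


daily = [5, 5, 5, 32, 32, 104, 5, 5, 104, 104, 104, 32, 337, 337, 32]
jumps = to_jump_time(daily)  # [5, 32, 104, 5, 104, 32, 337, 32]
dist = first_return_distribution(jumps)  # {1: 0.5, 2: 0.25, 3: 0.25}
```

Tracking only the most recent visit of each sector is what makes this a first return distribution: each revisit is measured against the last time the sector was occupied, not against all earlier visits.
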
1. Barthélemy, M. Spatial networks. Phys. Rep. 499, 1–101 (2010).
2. Guimerà, R., Mossa, S., Turtschi, A. & Amaral, L. The worldwide air transportation network: anomalous centrality, community structure, and cities' global roles. Proc. Natl. Acad. Sci. USA 102, 7794–7799 (2005).
3. Helbing, D. Traffic and related self-driven many-particle systems. Rev. of Mod. Phys. 73, 1067 (2001).
4. Makse, H. A., Havlin, S. & Stanley, H. E. Modelling urban growth patterns. Nature 377, 608–612 (1995).
5. Roth, C., Kang, S. M., Batty, M. & Barthélemy, M. Structure of urban movements: Polycentric activity and entangled hierarchical flows. PLoS ONE 6, e15923 (2011).
6. Pastor-Satorras, R. & Vespignani, A. Epidemic spreading in scale-free networks. Phys. Rev. Lett. 86, 3200–3203 (2001).
7. Colizza, V., Barrat, A., Barthélemy, M. & Vespignani, A. The role of the airline transportation network in the prediction and predictability of global epidemics. Proc. Natl. Acad. Sci. USA 103, 2015 (2006).
8. Hufnagel, L., Brockmann, D. & Geisel, T. Forecast and control of epidemics in a globalized world. Proc. Natl. Acad. Sci. USA 101, 15124–15129 (2004).
9. Balcan, D. et al. Multiscale mobility networks and the spatial spreading of infectious diseases. Proc. Natl. Acad. Sci. USA 106, 21484–21489 (2009).
10. Miritello, G., Moro, E. & Lara, R. Dynamical strength of social ties in information spreading. Phys. Rev. E 83, 045102 (2011).
11. Onnela, J. et al. Structure and tie strengths in mobile communication networks. Proc. Natl. Acad. Sci. USA 104, 7332 (2007).
12. Quercia, D., Lathia, N., Calabrese, F., Di Lorenzo, G. & Crowcroft, J. Recommending social events from mobile phone location data. In Data Mining (ICDM), 2010 IEEE 10th International Conference on, 971–976 (2010).
13. Jensen, P. Network-based predictions of retail store commercial categories and optimal locations. Phys. Rev. E 74, 035101 (2006).
14. Brockmann, D., Hufnagel, L. & Geisel, T. The scaling laws of human travel. Nature 439, 462–465 (2006).
15. Thiemann, C., Theis, F., Grady, D., Brune, R. & Brockmann, D. The structure of borders in a small world. PLoS ONE 5, e15422 (2010).
16. González, M., Hidalgo, C. & Barabási, A. Understanding individual human mobility patterns. Nature 453, 779–782 (2008).
17. Scellato, S., Noulas, A., Lambiotte, R. & Mascolo, C. Socio-spatial properties of online location-based social networks. Proceedings of ICWSM 11 (2011).
18. Scellato, S., Musolesi, M., Mascolo, C., Latora, V. & Campbell, A. Nextplace: A spatio-temporal prediction framework for pervasive systems. Pervasive Computing 152–169 (2011).
19. Bazzani, A., Giorgini, B., Rambaldi, S., Gallotti, R. & Giovannini, L. Statistical laws in urban mobility from microscopic GPS data in the area of Florence. J. Stat. Mech. 2010, P05001 (2010).
20. Cattuto, C. et al. Dynamics of person-to-person interactions from distributed RFID sensor networks. PLoS ONE 5, e11596 (2010).
21. Song, C., Koren, T., Wang, P. & Barabási, A. Modelling the scaling properties of human mobility. Nature Physics 6, 818–823 (2010).
22. Song, C., Qu, Z., Blumm, N. & Barabási, A. Limits of predictability in human mobility. Science 327, 1018 (2010).
23. Belik, V., Geisel, T. & Brockmann, D. Natural human mobility patterns and spatial spread of infectious diseases. Phys. Rev. X 1, 011001 (2011).
24. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D. Complex networks: structure and dynamics. Phys. Rep. 424, 175–308 (2006).
25. Szell, M. & Thurner, S. Measuring social dynamics in a massive multiplayer online game. Social Networks 32, 313–329 (2010).
26. Szell, M., Lambiotte, R. & Thurner, S. Multirelational organization of large-scale social networks in an online world. Proc. Natl. Acad. Sci. USA 107, 13636–13641 (2010).
27. Castronova, E. On the research value of large games. Games and Culture 1, 163–186 (2006).
28. Bainbridge, W. The scientific research potential of virtual worlds. Science 317, 472 (2007).
29. www.pardus.at.
30. Thurner, S., Szell, M. & Sinatra, R. Emergence of good conduct, scaling and Zipf laws in human behavioral sequences in an online world. PLoS ONE 7, e29796 (2012).
31. Kölbl, R. & Helbing, D. Energy laws in human travel behaviour. New J. of Phys. 5, 48 (2003).
32. Han, X., Hao, Q., Wang, B. & Zhou, T. Origin of the scaling law in human mobility: Hierarchy of traffic systems. Phys. Rev. E 83, 036117 (2011).
33. Barabási, A. The origin of bursts and heavy tails in human dynamics. Nature 435, 207 (2005).
34. Fortunato, S. Community detection in graphs. Phys. Rep. 486, 75–174 (2010).
35. Arenas, A., Fernández, A. & Gómez, S. Analysis of the structure of complex networks at different resolution levels. New J. of Phys. 10, 053039 (2008).
36. Newman, M. Analysis of weighted networks. Phys. Rev. E 70, 056131 (2004).
37. Sinatra, R., Condorelli, D. & Latora, V. Networks of motifs from sequences of symbols. Phys. Rev. Lett. 105, 178702 (2010).
38. West, B., Grigolini, P., Metzler, R. & Nonnenmacher, T. Fractional diffusion and Lévy stable processes. Phys. Rev. E 55, 99 (1997).
39. Metzler, R. & Klafter, J. The random walk's guide to anomalous diffusion: a fractional dynamics approach. Phys. Rep. 339, 1–77 (2000).
40. Scafetta, N., Latora, V. & Grigolini, P. Lévy statistics in coding and non-coding nucleotide sequences. Phys. Lett. A 299, 565–570 (2002).
41. Viswanathan, G. et al. Optimizing the success of random searches. Nature 401, 911–914 (1999).
42. Ratti, C. et al. Redrawing the map of Great Britain from a network of human interactions. PLoS ONE 5, e14248 (2010).
43. Newman, D. The lines that continue to separate us: borders in our borderless world. Progress in Human Geography 30, 143 (2006).
44. Lambiotte, R. et al. Geographical dispersal of mobile communication networks. Physica A 387, 5317–5325 (2008).

Acknowledgments
This work was conducted under the HPC-EUROPA2 project (project number: 228398)
with the support of the European Commission – Capacities Area – Research Infrastructures
initiative, and within the framework of European Cooperation in Science and Technology
Action MP0801 Physics of Competition and Conflicts. M.S. and S.T. acknowledge support
from the Austrian Science Fund Fonds zur Förderung der wissenschaftlichen Forschung P
23378, and from project EU FP7 – INSITE. M.S., R.S. and G.P. also thank the Santa Fe
Institute for the opportunities offered during the Complex Systems Summer School 2010,
where this project originated.

Author contributions
All the authors have equally contributed to the design of the study, to the analysis and
interpretation of the results and to the preparation of the manuscript.

Additional information
Supplementary information accompanies this paper at http://www.nature.com/scientificreports
Competing financial interests: The authors declare no competing financial interests.
License: This work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this
license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/
How to cite this article: Szell, M., Sinatra, R., Petri, G., Thurner, S. & Latora, V.
Understanding mobility in a social petri dish. Sci. Rep. 2, 457; DOI:10.1038/srep00457
(2012).
