arXiv:cond-mat/0205601v1 28 May 2002
Efficiency of Scale-Free Networks:
Error and Attack Tolerance
Paolo Crucitti a, Vito Latora b, Massimo Marchiori c,d, and
Andrea Rapisarda b
a Scuola Superiore di Catania, Via S. Paolo 73, 95123 Catania, Italy
b Dipartimento di Fisica e Astronomia, Università di Catania, and INFN sezione di Catania, Corso Italia 57, 95129 Catania, Italy
c W3C and Lab. for Computer Science, Massachusetts Institute of Technology, USA
d Dipartimento di Informatica, Università di Venezia, Italy
Abstract
The concept of network efficiency, recently proposed to characterize the properties
of small-world networks, is here used to study the effects of errors and attacks on
scale-free networks. Two different kinds of scale-free networks, i.e. networks with
power law P(k), are considered: 1) scale-free networks with no local clustering produced by the Barabasi-Albert model and 2) scale-free networks with high clustering
properties as in the model by Klemm and Eguíluz, and their properties are compared
to the properties of random graphs (exponential graphs). By using as mathematical
measures the global and the local efficiency we investigate the effects of errors and
attacks both on the global and the local properties of the network. We show that the
global efficiency is a better measure than the characteristic path length to describe
the response of complex networks to external factors. We find that, at variance with
random graphs, scale-free networks display, both on a global and on a local scale,
a high degree of error tolerance and an extreme vulnerability to attacks. In fact,
the global and the local efficiency are unaffected by the failure of some randomly
chosen nodes, though they are extremely sensitive to the removal of the few nodes
which play a crucial role in maintaining the network’s connectivity.
Key words: Structure of Complex Networks, Scale-Free Networks
PACS: 89.75.-k, 89.75.Fb, 05.90.+m
Preprint submitted to Elsevier Preprint
1 February 2008
1 Introduction
The study of the structural properties of the underlying network can be very
important to understand the functions of a complex system [1]. For instance
the architecture of a computer network is the first critical issue to take into account when we want to design an efficient communication system. Similarly,
the efficiency of the communication and of the navigation over the Net is
strongly related to the topological properties of the Internet and of the World
Wide Web. The connectivity structure of a population (the set of social contacts) affects the way ideas are diffused, but also the spreading of epidemics
over the network. Only very recently the increasing accessibility of databases
of real networks on one side, and the availability of powerful computers on
the other side, have made possible a series of empirical studies on the properties of biological, technological and social networks. The results obtained have
shown that, in most cases, real networks are very different from random and
regular networks, and display some common properties such as high efficiency and
a high degree of robustness. The literature on complex networks has grown
exponentially in the last few years; a comprehensive review can be found
in Refs.[2–4]. In the following we list some of the results from
the recent literature that are important for understanding the purpose of
this paper:
(1) In ref. [5], Watts and Strogatz have shown that the connection topology of some real networks is neither completely regular nor completely
random. These networks, named small-world networks [6], exhibit in fact
high clustering coefficient, like regular lattices, and small average distance
between two generic points (small characteristic path length), like random graphs. Watts and Strogatz have also proposed a simple model (the
WS model) to construct networks with small-world properties (i.e. networks with high clustering and small average distance), by rewiring few
edges of a regular lattice.
(2) In ref.[7] two of us have introduced the concept of efficiency of a network, which measures how efficiently the information is exchanged over
the network. By using the efficiency as a new measure to characterize
the network, it has been shown that small-worlds are systems that are
both globally and locally efficient. Moreover the description of a network in terms of its efficiency extends the small-world analysis also to
unconnected networks and to real systems that are better represented as
weighted networks [8–10].
(3) Small average distance and high clustering are not all the common features of complex networks. Barabasi and collaborators have studied P (k),
the degree distribution of a network, and found that many large networks
(the World Wide Web, Internet, metabolic networks and protein networks) are scale-free, i.e. have a power-law degree distribution P(k) ∼ k^{-γ}
[11–16]. Neither random graph theory [17], nor the WS model to construct
networks with the small-world properties [5] can reproduce this feature:
in fact both give P (k) peaked around the average value of k. In ref. [12]
Barabasi and Albert have proposed a simple model (the BA model) to
construct a scale-free topology by modeling the dynamical growth of the
network: some ad hoc assumptions in the network dynamics result in a
network with the correct scale-free features, i.e. with a power-law degree
distribution P(k) ∼ k^{-3}.
Moreover in ref.[14] the authors have shown that scale-free networks, at
variance with random networks, display a high degree of error tolerance.
That is, the ability of their nodes to communicate is unaffected by the
failure of some randomly chosen nodes. However, error tolerance comes
at a high price in that scale-free networks are extremely vulnerable to
attacks, i.e. to the removal of a few nodes which play a crucial role in
maintaining the network’s connectivity. Such error tolerance and attack
vulnerability typical of scale-free networks have also been found in real
networks [14].
(4) The BA scale-free model produces networks with a power law connectivity
distribution, but not with small-world properties. In fact the BA scale-free networks have small average distance between two generic points, the
first property of a small-world network, while they lack high clustering,
the other property of a small-world network. More recently Klemm and
Eguíluz [18] have proposed an alternative model (the KE model) to construct networks where scale-free degree distributions coexist with small
average distances and with strong clustering. Therefore, the KE model
reproduces, at the same time, the two distinct features present in real
networks: power law degree distribution and the small-world behavior.
In this paper we use the concept of global and local efficiency to characterize
the properties of scale-free networks (i.e. networks with power law degree
distributions), and to study their error and attack tolerance. We consider both
scale-free networks with no clustering (the BA model), and scale-free networks
with high clustering properties (the KE model). We analyze the effect of errors
and attacks not only on the global properties of the network (as done in ref.[14]
by using as a measure the average distance between two points) but also on
the local properties of the network. Moreover we compare the results obtained
in terms of global and local efficiency of the network with the results in terms
of average distance and clustering coefficient. The three innovative points of
our paper are:
• The use of the efficiency measure to characterize scale-free networks. This
avoids the problems due to the divergence of the average distance.
• The parallel study of scale-free networks with no clustering, and scale-free
networks with high clustering.
• The study of the effect of errors and attacks not only on the global properties,
but also on the local properties of the network.
The paper is organized as follows. In Section 2 we define the variable efficiency
and we illustrate how the small-world behavior can be expressed in terms of
the local and the global efficiency of the network. In Section 3 we discuss
the relevance and the properties of scale-free networks, and we illustrate the
BA model and the KE model. In Section 4, the central part of the paper,
we investigate the effects of errors and attacks both on the global and on the
local properties of scale-free networks. We show that the efficiency is a better
measure than the characteristic path length to describe the global properties
of complex networks, especially when a large number of nodes is removed.
The local properties of the scale-free networks are equally well described by
the local efficiency or by the clustering coefficient. By considering both BA and
KE scale-free networks, we show that scale-free networks are systems resistant
to errors but vulnerable to attacks both at a global and at a local level. In
Section 5 we draw the conclusions.
2 Small-World behavior and Efficiency of a Network
In their seminal paper Watts and Strogatz have shown that the connection
topology of some real (biological, social and technological) networks is neither completely regular nor completely random [5]. Watts and Strogatz have
named these networks, that are somehow in between regular and random
networks, small-worlds, in analogy with the small-world phenomenon, empirically observed in social systems more than 30 years ago [6]. The mathematical
characterization of the small-world behavior is based on the evaluation of two
quantities, the characteristic path length L, measuring the typical separation
between two generic nodes in the network and the clustering coefficient C,
measuring the average cliquishness of a node. Small-world networks are in
fact highly clustered, like regular lattices, yet having small characteristic path
lengths, like random graphs. Let us give some useful mathematical formalism.
A generic unweighted (or relational) network [9] is represented by a graph
G with N vertices (nodes) and K edges (arcs, links or connections). Such a
graph is described by the so-called adjacency matrix {aij} (also called connection matrix). This is an N × N symmetric matrix, whose entry aij is 1 if there is
an edge joining vertex i to vertex j, and 0 otherwise. An important quantity
of graph G, which will be used in the following of this paper, is the degree of a
generic vertex i, i.e. the number ki of edges incident with vertex i, the number
of neighbours of i. We have K = Σ_i ki / 2 because each link is counted twice,
and the average value of ki is <k> = 2K/N. To define L we need first to
construct the shortest path length dij between two vertices (known in social
networks studies as the number of degrees of separation [6]), measured as the
minimum number of edges traversed to get from a vertex i to another vertex
j. By definition dij ≥ 1 with dij = 1 if there exists a direct edge between i
and j. The characteristic path length L of graph G is defined as the average
of the shortest path lengths between two generic vertices:
$$L(G) = \frac{1}{N(N-1)} \sum_{i \neq j \in G} d_{ij} \qquad (1)$$
Of course this definition is valid only if G is totally connected, which means
that there must exist at least a path connecting any couple of vertices with
a finite number of steps. Otherwise, when from i* we cannot reach j*, then
d_{i*j*} = +∞ and consequently L as given in eq.(1), being divergent, is an ill-defined quantity. When studying how the properties of a network are affected
by the removal of nodes, one often encounters non-connected networks. In such
cases the alternative formalism in terms of efficiency here proposed is much
more powerful, as will be clarified in the following.
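As an illustration of eq.(1) and of its divergence, the following is a minimal sketch, assuming the graph is stored as a dictionary mapping each node to a list of its neighbours. The function names `bfs_distances` and `characteristic_path_length` are hypothetical and are not taken from the paper; shortest paths are found with a breadth-first search, and L is reported as infinite as soon as one pair of nodes is unreachable.

```python
from collections import deque
from math import inf


def bfs_distances(adj, source):
    """Shortest path lengths (number of edges) from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def characteristic_path_length(adj):
    """Eq. (1): mean of d_ij over ordered pairs i != j (needs N >= 2).

    Returns inf when the graph is not connected, since some d_ij is then infinite.
    """
    n = len(adj)
    total = 0
    for i in adj:
        dist = bfs_distances(adj, i)
        if len(dist) < n:  # at least one node cannot be reached from i
            return inf
        total += sum(dist.values())  # d_ii = 0 contributes nothing
    return total / (n * (n - 1))


# A path 0-1-2 (connected) and the same nodes with node 2 cut off.
path = {0: [1], 1: [0, 2], 2: [1]}
split = {0: [1], 1: [0], 2: []}
print(characteristic_path_length(path), characteristic_path_length(split))
```

For the three-node path the distances are 1, 1 and 2, so L = 4/3, while cutting off a single node is enough to make L diverge.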
The second measure, the clustering coefficient C, is a local quantity of G
defined as follows. For any node i we consider Gi , the subgraph of neighbors
of i. That is, once i is eliminated, we study how many of the nodes previously connected
to i remain connected to each other. If the node i has ki neighbors,
then Gi has ki nodes and at most ki (ki − 1)/2 edges. Ci is the fraction of
these edges that actually exist, and C is the average value of Ci all over the
network:
$$C(G) = \frac{1}{N} \sum_{i \in G} C_i, \qquad C_i = \frac{\#\ \text{of edges in } G_i}{k_i (k_i - 1)/2} \qquad (2)$$
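Eq.(2) can be sketched in a few lines, under the same assumed adjacency-dictionary representation (node mapped to a list of neighbours). The function name `clustering_coefficient` is hypothetical, and setting Ci = 0 for nodes with fewer than two neighbours is a convention assumed here, because eq.(2) is undefined for ki < 2.

```python
def clustering_coefficient(adj):
    """Eq. (2): average over all nodes of the fraction of existing edges in G_i."""
    total = 0.0
    for i, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            continue  # assumed convention: C_i = 0 when k_i < 2
        nbrs = set(neighbours)
        # Each edge inside G_i is seen from both of its ends, hence the // 2.
        links = sum(1 for u in nbrs for v in adj[u] if v in nbrs) // 2
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


# A triangle 0-1-2 with a pendant node 3 attached to node 0.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
print(clustering_coefficient(graph))
```

Here C0 = 1/3 (only the edge 1-2 exists among the three neighbours of node 0), C1 = C2 = 1 and C3 = 0, giving C = 7/12.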
To illustrate the onset of the small-world, Watts and Strogatz have proposed
a one-parameter model (the WS model) to construct a class of unweighted
graphs which interpolates between a regular lattice and a random graph. The
edges of a regular lattice are rewired with a probability p. As the rewiring
probability p increases, the network becomes increasingly disordered and for
p = 1 a random graph is obtained. Although in the two limiting cases large
C is associated with large L (p = 0) and, vice versa, small C with small L (p = 1),
there is an intermediate regime where the network is a small-world: highly
clustered like a regular lattice and with small characteristic path lengths like
a random graph. In fact only a few rewired edges (0 < p ≪ 1) are sufficient
to produce a rapid drop in L, while C is not affected and remains equal to
the value for the regular lattice [5]. By means of this mathematical formalism
based on the evaluation of L and C, Watts and Strogatz have found three
examples of small-world behavior in real networks: 1) the collaboration graph
of actors in feature films from Ref.[19], as an example of a social system; 2) the
neural network of a nematode, the C. elegans [20] as an example of a biological
network; 3) finally an example of a technological network, the electric power
grid of the western United States.
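The rewiring procedure described above can be sketched as follows. The text only states that the edges of a regular lattice are rewired with probability p; the details used here (a ring lattice in which each node is linked to its k nearest neighbours, rewiring of one end of each edge, and rejection of self-loops and duplicate edges) are assumptions of this sketch, and the function name `watts_strogatz` is hypothetical.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Ring lattice of n nodes (k nearest neighbours each, k even, n > k + 1),
    with every lattice edge rewired with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Regular ring lattice: link each node to k/2 neighbours on either side.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    # Visit each lattice edge (i, i + step) once and rewire its far end.
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if rng.random() < p:
                candidates = [v for v in range(n) if v != i and v not in adj[i]]
                if not candidates:
                    continue  # node i is already linked to every other node
                new = rng.choice(candidates)
                adj[i].remove(j)
                adj[j].remove(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj


lattice = watts_strogatz(20, 4, 0.0)
rewired = watts_strogatz(20, 4, 0.3, seed=1)
print(sum(len(v) for v in rewired.values()) // 2)
```

Each rewiring removes one edge and adds one, so the number of edges K = nk/2 is the same for every value of p; only the way the edges are distributed changes.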
An alternative definition of the small-world behavior has been proposed more
recently by two of us in ref.[7,9] and is based on the definition of the efficiency of a network. Instead of L and C the network is characterized in
terms of how efficiently it propagates information on a global and on a local
scale, respectively. To define the efficiency of G let us suppose that every node
sends information along the network, through its edges. We assume that the
efficiency εij in the communication between node i and j is inversely proportional to the shortest distance: εij = 1/dij ∀i, j. With this definition, when
there is no path in the graph between i and j, dij = +∞ and consistently
εij = 0. The global efficiency of the graph G can be defined as:
$$E_{\mathrm{glob}}(G) = \frac{\sum_{i \neq j \in G} \epsilon_{ij}}{N(N-1)} = \frac{1}{N(N-1)} \sum_{i \neq j \in G} \frac{1}{d_{ij}} \qquad (3)$$
and the local efficiency, in analogy with C, can be defined as the average
efficiency of local subgraphs:
$$E_{\mathrm{loc}}(G) = \frac{1}{N} \sum_{i \in G} E(G_i), \qquad E(G_i) = \frac{1}{k_i (k_i - 1)} \sum_{l \neq m \in G_i} \frac{1}{d'_{lm}} \qquad (4)$$
where Gi , as previously defined, is the subgraph of the neighbours of i, which
is made by ki nodes and at most ki (ki − 1)/2 edges. It is important to notice
that the quantities {d′lm} are the shortest distances between nodes l and m
calculated on the graph Gi . The two definitions we have given have the important property that both the global and local efficiency are already normalized,
that is: 0 ≤ Eglob(G) ≤ 1 and 0 ≤ Eloc(G) ≤ 1 [21]. The maximum values of
the efficiency, Eglob(G) = 1 and Eloc(G) = 1, are obtained in the ideal case
of a completely connected graph, i.e. in the case in which the graph G has
all the N(N − 1)/2 possible edges and dij = 1 ∀i, j. In the efficiency-based
formalism a small-world results as a system with high Eglob (corresponding
to low L) and high Eloc (corresponding to high clustering C), i.e. a network
extremely efficient in exchanging information both on a global and on a local
scale. Moreover the description of a network in terms of its efficiency extends
the small-world analysis also to unconnected networks and, more important,
with only a few modifications, to weighted networks. A weighted network is
a case in which there is a weight associated with each of the edges. Such a network needs two matrices to be described: the usual adjacency matrix {aij},
indicating the existence or non-existence of a link (and whose entry aij,
as for the unweighted case, is 1 when there is an edge joining i to j, and 0
otherwise), and a second matrix, the matrix of the weights associated with
each link. All the details of the applications of the efficiency-based formalism
to study real weighted networks, e.g. the Boston subway transportation system, can be found in [7–9]. In this paper we focus instead on the simpler case
of unweighted networks: we are in fact interested in the use of the efficiency
formalism to describe in quantitative terms the global and the local properties
of scale-free networks, and to study how these properties are affected by the
random removal of nodes or by attacks. A simple example will be very useful
to illustrate the comparison between Eglob, Eloc and L, C, and to explain why
the efficiency in many cases works better than L and C, even for unweighted
networks. In particular the differences between the description in terms of Eglob
and the description in terms of L are evident when the network is unconnected.

Fig. 1. The connectivity properties of two graphs G1 and G2, both with N=5 nodes,
are compared. Unlike the efficiency Eglob, the characteristic path length L
is not a representative measure when the graph is unconnected. At the local level,
C is a good approximation of Eloc.

Fig. 1 is an example of the problems associated with the calculation of L when
the graph is unconnected. We consider 2 graphs G1 and G2, both having
the same number of nodes N = 5, but different number of edges. By using
the definition (1) we obtain L1 = 13/10 for the graph on the left hand side
and L2 = ∞ for the graph on the right hand side. An alternative possibility
to avoid the divergence of L2 is to limit the use of definition (1) only to a
part of the graph, the main connected component [22] of G2, which is made
of 3 nodes. In this way we get L2 = 1 and the final information we extract
from the analysis of the characteristic path length is that graph G2 has better
structural properties than graph G1, since L2 < L1 . This is of course wrong
because G1 is certainly much better connected than G2, and the misleading
information comes from the fact that in the second graph we had to remove
two nodes from the analysis. By studying instead the efficiency of the two
graphs we are allowed to take into account also the nodes not connected to
the main connected component: we get (Eglob )1 = 17/20 and (Eglob )2 = 3/10,
in perfect agreement with the fact that G1 has a much better connectivity
(17/20 of the efficiency of the completely connected graph) than G2.
On the other side an evaluation of the local clustering of the two graphs
gives: C1 = 4/5, C2 = 3/5, and an evaluation of the local efficiency gives:
(Eloc )1 = 9/10, (Eloc )2 = 3/5. This indicates that the first graph has also
better local properties than the second one. Moreover the variable C is a good
approximation of the local efficiency Eloc (this is in general true when the
subgraphs Gi of a generic node i are small graphs [9]).
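The values quoted for G2 can be checked with a short sketch of eqs.(3) and (4). The figure itself is not reproduced in the text, so the structure used for G2 below (a triangle plus two isolated nodes) is inferred from the quoted numbers and is an assumption; G1 is not encoded because its edge list cannot be recovered from the text. The function names are hypothetical, graphs are assumed to be dictionaries mapping a node to its neighbours, and E(Gi) = 0 is assumed for nodes with fewer than two neighbours.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path lengths (number of edges) from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Eq. (3): mean of 1/d_ij over ordered pairs; unreachable pairs contribute 0."""
    n = len(adj)
    if n < 2:
        return 0.0  # assumed convention for subgraphs with fewer than two nodes
    total = 0.0
    for i in adj:
        dist = bfs_distances(adj, i)
        total += sum(1.0 / d for node, d in dist.items() if node != i)
    return total / (n * (n - 1))


def local_efficiency(adj):
    """Eq. (4): average efficiency of the neighbour subgraphs G_i.

    Distances d'_lm are computed inside G_i, i.e. node i itself is excluded.
    """
    total = 0.0
    for i in adj:
        nbrs = set(adj[i])
        sub = {u: [v for v in adj[u] if v in nbrs] for u in nbrs}
        total += global_efficiency(sub)
    return total / len(adj)


# Assumed structure of G2: triangle 0-1-2 plus isolated nodes 3 and 4.
g2 = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [], 4: []}
print(global_efficiency(g2), local_efficiency(g2))
```

With this structure only the 6 ordered pairs inside the triangle contribute to eq.(3), each with 1/dij = 1, so Eglob = 6/20 = 3/10; each triangle node has a fully connected neighbourhood and each isolated node contributes zero, so Eloc = 3/5. Both agree with the values given in the text, and no node has to be discarded from the analysis.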
3 Scale-Free Networks
An important piece of information to characterize a graph G, as previously mentioned,
is the degree of a generic vertex i, i.e. the number ki of edges incident with
vertex i, the number of neighbours of i. Barabasi and collaborators focussed
their attention on P(k), the degree distribution of a network, and showed that
many real large networks, such as the World Wide Web, the Internet, metabolic
and protein networks, are scale-free, that is, their degree distribution follows
a power-law for large k [11–16]. Also some social systems of interest for the
spreading of sexually transmitted diseases [23,24], and the connectivity network of atomic clusters’ systems [25] display a similar behavior. Neither random graphs [17], nor small-world networks constructed according to the WS
model, have a power-law degree distribution P (k) like the one observed in
real large networks. In fact for a random graph P(k) is described by a Poisson
distribution P(k) = e^{-<k>} <k>^k / k!, a curve peaked at k = <k> and exponentially decaying for large k, in contrast to the power-law decay of scale-free
graphs. This is the reason why random graphs are sometimes referred to in the
literature as exponential graphs [14]. Also in the case of the WS small-world
model P (k) is strongly peaked around the average value of k (since it is very
close to the P (k) of regular graphs). Furthermore, even for those real networks for which P (k) is not clearly a power law for all values of k, and has for
instance an exponential cut off for very large k, the degree distribution significantly deviates from the Poisson distribution expected for random graphs [26]. At this
point two natural questions come to mind: 1) What is the mechanism
responsible for the emergence of a scale-free structure in such a huge number
of real networks? 2) What are the main properties of a scale-free topology,
and why is it privileged with respect to the other topologies?
An answer to the first question and a concrete algorithm to construct a scale-free network has been proposed by Barabasi and collaborators. In Refs.[12,13]
the authors argue that the scale-free nature of real networks is rooted in two
generic mechanisms occurring in many real networks. First of all most real-world networks describe open systems which grow by the continuous addition
of new nodes: as an example the WWW grows exponentially in time by the
addition of new web pages, or the research literature constantly grows by
the publication of new papers. Moreover most real networks exhibit preferential attachment, that is, the likelihood of connecting to a node depends on
the node’s degree. A webpage will most likely include hyperlinks to popular
documents which have already a high degree, because such highly connected
documents are easier to find. A new manuscript will most likely cite a well-known one, further increasing its already high number of citations. Growth and
preferential attachment are the two ingredients sufficient to produce a scale-free network. The Barabasi-Albert (BA) model proposed in [12,13] is a simple
way to generate a network with a power-law degree distribution P(k) ∼ k^{-γ}
with γ = 3. On the contrary, neither of the two ingredients is present
in the small-world model discussed in Section 2, that assumes instead a fixed
number N of vertices and a probability that two nodes are connected (or their
connection is rewired) independent of the nodes’ degree.
Concerning the second question, the authors of ref.[14] have studied the response of scale-free networks to errors and to attacks. By error and attack
they indicate, respectively, the removal of randomly chosen nodes, and the removal of the most connected nodes. In particular they study the change of the
characteristic path length L when a small fraction of the nodes is eliminated:
in fact the removal of a node in general increases the distance between the
remaining nodes, because it can eliminate paths contributing to the connectivity of the system. Unlike random networks, scale-free networks
display a high degree of error tolerance, i.e. the ability of their nodes to communicate is unaffected by the failure of some randomly chosen nodes. On the
contrary these networks are extremely vulnerable to attacks, i.e. the removal
of a few nodes that play a vital role in maintaining the network’s connectivity. In practice the presence of the scale-free topology in so many real cases
[11,15,16,23] can be attributed to the need to construct systems with a high
degree of tolerance against errors. However, error tolerance comes at a high
price in that scale-free networks are extremely vulnerable to attacks. The
response of scale-free networks to the removal of nodes is also one of the main
points of our paper. In fact, in Section 4 we will extend the analysis of ref.[14],
that was only based on the quantity L, to both the global and local properties
of the network. In order to characterize the local properties of a graph we will
use both C and Eloc. For the global properties we will see that Eglob is better
than L especially when a large number of nodes are removed.
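The two removal prescriptions defined above (error: randomly chosen nodes; attack: the most connected nodes) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `remove_nodes` is hypothetical, graphs are assumed to be dictionaries mapping a node to a list of neighbours, and ranking the nodes once by their initial degree (without recomputing degrees after each removal) is an assumption of the sketch.

```python
import random


def remove_nodes(adj, p, mode, seed=None):
    """Return a copy of the graph with a fraction p of its nodes removed.

    mode="failure": nodes are chosen uniformly at random (error).
    mode="attack":  nodes are removed in decreasing order of initial degree.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    n_remove = int(round(p * len(nodes)))
    if mode == "failure":
        removed = set(rng.sample(nodes, n_remove))
    elif mode == "attack":
        ranked = sorted(nodes, key=lambda v: len(adj[v]), reverse=True)
        removed = set(ranked[:n_remove])
    else:
        raise ValueError("mode must be 'failure' or 'attack'")
    # Drop the removed nodes together with every edge incident on them.
    return {u: [v for v in adj[u] if v not in removed]
            for u in adj if u not in removed}


# A star: the hub 0 carries every edge, so a single attack disconnects it.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
attacked = remove_nodes(star, 0.2, "attack")
failed = remove_nodes(star, 0.2, "failure", seed=0)
print(attacked)
```

The star graph is an extreme case of the behavior discussed in the text: removing the single most connected node leaves four isolated nodes, whereas a random failure hits the hub only one time in five.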
The BA scale-free model reproduces the power-law connectivity distribution,
but not the small-world effect. In fact it produces networks with small average distance between two generic nodes, like a small-world network, but lacks
high clustering, which is typical of a small-world network. On the contrary,
most large real networks with power-law connectivity distribution also show
a high clustering coefficient. As an example the values of C obtained from the
two databases of Internet and of the World Wide Web studied are orders of
magnitude larger than the clustering coefficients for the corresponding random
graphs [3]. In order to overcome this problem Klemm and Eguíluz [18] have
recently proposed an alternative model, the KE model, which produces networks with scale-free degree distributions, small average distances and with
strong clustering. With a minimal amount of changes to the BA model, the KE
model reproduces, at the same time, the two distinct features of real networks:
power-law degree distribution and small-world effect. We do not go into the
details of the KE model here. Since the subject of this paper is the study of
the properties of scale-free networks, in the next section we discuss how
to construct scale-free networks with the BA model, and scale-free networks
with high clustering by means of the KE model.
4 Efficiency in Scale-Free Networks
We are finally ready to study how the efficiency of a network with scale-free
topology is affected by the removal of some of its nodes. We will make use
of the measures defined in eqs. (3) and (4), and compare the results with
the ones obtained in terms of L and C. The first step is the construction of a
scale-free network: for this purpose we consider both the BA model and the
KE model.
4.1 Barabasi-Albert (BA) scale-free networks
First we construct the scale-free network following the Barabasi-Albert (BA)
model [12,13]. As previously mentioned the two ingredients of the BA model
are growth and preferential attachment. In fact the algorithm [12] is based on
the iteration of the following two steps:
(1) Addition of nodes: Starting with a small number (m0) of nodes, at every
timestep a new node with m (≤ m0) edges is added. The edges link the new
node to m different nodes already present in the system.
(2) Preferential attachment of new edges: When choosing the nodes to which
the new node connects, the probability Π that the new node will be connected
to node i is assumed to depend on the degree ki of node i, according to:
$$\Pi(k_i) = \frac{k_i}{\sum_j k_j} \qquad (5)$$
After t timesteps the algorithm produces a network with N = t + m0 nodes
and mt edges. The analytical solution of the BA model in the mean field
approximation predicts a degree distribution
$$P(k) = \frac{2 m^2 t}{m_0 + t}\, k^{-3}.$$
This function asymptotically converges for t → ∞ to a time-independent degree distribution
P(k) ∼ 2m^2 k^{-γ}, i.e. to a power law with an exponent γ = 3. It is interesting
to notice that γ depends neither on m nor on the size N = m0 + t of
the network.
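The two steps of the BA algorithm can be sketched as follows. This is an illustrative sketch, not the code used for the simulations in this paper: the function name `barabasi_albert` is hypothetical, and two implementation choices are assumptions. First, eq.(5) is undefined while all degrees are zero, so the first new node picks its m targets uniformly among the m0 initial nodes. Second, preferential attachment is implemented by sampling from a list in which every node appears once per incident edge, which makes the selection probability proportional to the degree.

```python
import random


def barabasi_albert(m0, m, t, seed=None):
    """Grow a network from m0 isolated nodes for t timesteps, adding one node
    with m edges per step. Returns a dict mapping node -> set of neighbours."""
    if not 1 <= m <= m0:
        raise ValueError("need 1 <= m <= m0")
    rng = random.Random(seed)
    adj = {i: set() for i in range(m0)}
    stubs = []  # node i appears k_i times, so rng.choice(stubs) follows eq. (5)
    for step in range(t):
        new = m0 + step
        if not stubs:
            # Assumption: with all degrees equal to zero, choose uniformly.
            targets = set(rng.sample(range(m0), m))
        else:
            targets = set()
            while len(targets) < m:  # m different nodes already in the system
                targets.add(rng.choice(stubs))
        adj[new] = set()
        for v in targets:
            adj[new].add(v)
            adj[v].add(new)
            stubs.extend((new, v))
    return adj


net = barabasi_albert(m0=3, m=2, t=50, seed=1)
n_nodes = len(net)
n_edges = sum(len(v) for v in net.values()) // 2
print(n_nodes, n_edges)
```

The counts reproduce the relations stated in the text, N = t + m0 nodes and mt edges. Note that with this choice of initial step an initial node that receives no link at the first timestep keeps degree zero and, according to eq.(5), is never selected afterwards.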
The mean field predictions are confirmed by other analytical approaches (master equation [27] and rate equation [28]) and by numerical simulations. Both
ingredients, growth and preferential attachment, are necessary in the
BA model for the emergence of the power-law scaling. Barabasi et al. have in
fact checked that a model with growth but no preferential attachment gives for
t → ∞ an exponential degree distribution. On the other hand, a model with
preferential attachment but no growth predicts that the degree distribution
becomes a Gaussian around its mean value. The BA model can be considered
as a particular case of a model proposed by Simon [29] in 1955 to describe
the scaling behaviour observed in distributions of word frequencies in texts,
and in population figures of cities [30]. Simon's original model has been
reformulated recently for network growth in ref.[31].

[Figure 2: log-log plots of P(k) versus k in two panels, (a) and (b), each with a k^{-3} guide line.]
Fig. 2. Degree distribution for the BA scale-free model (indicated as SFBA with full
circles) and for the KE scale-free model (indicated as SFKE with open squares).
Two system sizes are considered: N = 5000, K = 10000 in (a), and N = 15000,
K = 75000 in (b). For N = 5000 the results reported are obtained as averages
over 10 different realizations, while in the case N = 15000 only one realization is
considered. The dashed line is P(k) ∼ k^{-γ} with γ = 3.

In Fig. 2 we report the degree distribution of a scale-free network obtained
from the BA model (shown as full circles and indicated as SFBA). We have
constructed two networks, the first with N = 5000, K = 10000, and the second
with N = 15000, K = 75000. In the first case the results reported are obtained
as averages over 10 different realizations, while in the case N = 15000 a single
realization is sufficient to have good statistics.
4.2 Klemm-Eguíluz (KE) scale-free networks
In this section we introduce a different class of scale-free networks with high
clustering coefficient. We follow the method developed by Klemm and Eguíluz
(KE) in Ref.[18]. In the KE model, each node of the network is assigned a
binary state variable and can be either in an active state or in a non-active
state. Taking a completely connected network of m active nodes as initial
condition, the time-discrete dynamics of the KE model is based on the iteration
of the following three steps:
(1) Addition of nodes: A new node with m edges is added to the network.
(2) Preferential attachment: Each of the m edges of the new node connects
to one of the active nodes with probability 1 − µ, or to a non-active node
with probability µ. In the latter case the random
node is chosen according to the same rule of the BA model, the linear preferential attachment of eq. (5), i.e. the probability that node i obtains a link is
proportional to the node’s degree: Π(ki) = ki / Σ_j kj. The limit case µ = 1 of
the KE model is the BA model. The limit case µ = 0 is a model with high
clustering but large path length: in fact, as a function of the system size, C
quickly converges to a constant value, whereas L increases linearly [32].
(3) Activation and deactivation of nodes. One of the m active nodes is deactivated: the probability that node i is chosen for deactivation is
$$\Pi_i^{\mathrm{deact}} = \frac{k_i^{-1}}{\sum_l k_l^{-1}}.$$
The new node is set in the active state.
The KE model generates scale-free networks with degree distribution P(k) =
2m^2 k^{-3} (for k ≥ m) and average connectivity <k> = 2K/N = 2m [32]. Furthermore, by varying µ in the interval [0, 1] the model makes it possible to study
the cross-over between a case with high L and C (the model µ = 0 has been
studied previously in ref.[32]), and a case with small L and C (µ = 1 corresponds exactly to the BA model). Klemm and Eguíluz have shown in Figure 1
of Ref.[18] that a few “long-range” connections are sufficient to have a small-world transition: in fact, as soon as µ is different from zero, the average shortest
path length L drops rapidly approaching the minimum value of the BA model,
while the clustering coefficient C remains practically constant. Thus the KE
model with µ ≠ 0 and µ ≪ 1 reproduces the three generic properties of real-world networks: power-law degree distribution, small L and high C. In our
simulations in the following of this paper, we have used the KE model with
µ = 0.1.
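One possible reading of the three steps of the KE model is sketched below. It is not the code used for the simulations in this paper, and several choices are assumptions of the sketch: the function name `klemm_eguiluz` is hypothetical; each of the m edges is tied to one of the m active nodes and is redirected with probability µ to a non-active node chosen with probability proportional to its degree; a node that is already a target is not chosen again; and when no non-active node is available (as in the first timesteps) the edge stays on the active node. The sum in the deactivation probability is taken over the m active nodes.

```python
import random


def klemm_eguiluz(m, mu, t, seed=None):
    """Grow a KE-type network for t timesteps from a complete graph of m
    active nodes. Returns a dict mapping node -> set of neighbours."""
    rng = random.Random(seed)
    adj = {i: {j for j in range(m) if j != i} for i in range(m)}
    active = list(range(m))
    for step in range(t):
        new = m + step
        targets = set()
        for a in active:
            target = a  # default: the edge goes to an active node
            if rng.random() < mu:
                # Non-active nodes not yet chosen as targets by the new node.
                pool = [v for v in adj if v not in active and v not in targets]
                if pool:
                    weights = [len(adj[v]) for v in pool]  # linear preference
                    target = rng.choices(pool, weights=weights)[0]
            targets.add(target)
        adj[new] = set()
        for v in targets:
            adj[new].add(v)
            adj[v].add(new)
        # Deactivate one active node with probability proportional to 1/k_i,
        # then set the new node in the active state.
        inverse_degree = [1.0 / len(adj[a]) for a in active]
        out = rng.choices(active, weights=inverse_degree)[0]
        active[active.index(out)] = new
    return adj


ke = klemm_eguiluz(m=3, mu=0.1, t=200, seed=1)
ke_nodes = len(ke)
ke_edges = sum(len(v) for v in ke.values()) // 2
print(ke_nodes, ke_edges)
```

Every timestep adds one node and m distinct edges, so after t steps the sketch has m + t nodes and m(m − 1)/2 + mt edges, which gives <k> = 2K/N close to 2m for large t, in line with the value stated above.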
In Fig. 2 we report the degree distribution obtained for two different networks
N = 5000, K = 10000 and N = 15000, K = 75000. Numerical simulations are
shown both for the BA model (full circles) and the KE model (open squares).
A good power-law behavior is obtained with an exponent γ = 3 as expected.
The results for N = 5000 are obtained as averages over 10 different realizations, while in the case N = 15000 a single realization is sufficient to have
good statistics.
[Figure 3. Panel (a): L versus p; panel (b): Eglob versus p, for 0 ≤ p ≤ 0.02. Curves: EXP Failure, EXP Attack, SFBA Failure, SFBA Attack.]
Fig. 3. Resistance to failures and attacks: analysis of the global characteristics. BA
scale-free graphs (SFBA) are compared with random graphs (EXP). In both cases
we start with graphs with N = 5000 nodes and K = 10000 edges, and we
remove a fraction p of the nodes with two different prescriptions: failure and attack
(see text). The characteristic path length L, in panel (a), and the global efficiency Eglob,
in panel (b), are plotted as a function of p. The results reported here and in all the
following figures are averages over 10 different realizations.
4.3 Error and attack tolerance of BA scale-free networks
We are finally ready to address the problem of how the global and the local
properties of a scale-free network are affected by the removal of some of the
nodes. We consider first the class of scale-free networks generated by means
of the BA model of Section 4.1. The malfunctioning of a node in general
makes the communication between the remaining nodes less efficient, because
it can eliminate some of the edges and consequently some of the paths that
contribute to the interconnectedness of the system. This will affect not only the
global, but also the local properties of the graph (though the latter have never
been addressed in the literature before). As a starting point in our numerical
experiments we consider a BA scale-free network with N = 5000 nodes and
K = 10000 edges, corresponding to $\langle k \rangle = 4$. The error and attack tolerance
of this network is compared to that of a random graph with the same number
of nodes and edges. As previously mentioned, the P(k) of a random graph is
a Poisson distribution, a curve which for large k decays exponentially and not
as a power law. For this reason the random graph is indicated in the figure
captions as an exponential graph (EXP). In removing the nodes, we use two
different strategies. We can simulate an error in the system as the failure of
[Figure 4. Panel (a): L versus p; panel (b): Eglob versus p, for 0 ≤ p ≤ 0.8. Curves: EXP Failure, EXP Attack, SFBA Failure, SFBA Attack.]
Fig. 4. Resistance to failures and attacks: analysis of the global characteristics. BA
scale-free graphs are compared with random graphs. Same as in previous figure, but
now the whole range of p is considered.
a node chosen at random among all the possible nodes. Alternatively, we can
simulate an attack on the system by sorting the nodes in order of importance,
according to their degree $k_i$, and then removing them one by one starting
from the node with the highest degree. In fact an agent who is well informed about
the whole structure of the network and wants to deliberately damage it
will not target the nodes randomly, but will preferentially attack the
most connected nodes. Both for failures and for attacks a fraction p of the N
nodes is removed, and the properties of the networks are studied by computing
the two quantities L and C, or the two quantities Eglob and Eloc, as a function
of p (see Section 4.4).
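The two removal prescriptions can be sketched as follows. This is a minimal illustration, not the code used for the simulations: the function name `remove_fraction`, the adjacency-dict representation and the choice of ranking the nodes by their degree in the initial graph (rather than recomputing the degrees after each removal) are assumptions made here.

```python
import random

def remove_fraction(adj, p, mode, rng):
    """Return a copy of adj ({node: set(neighbours)}) with a fraction p removed.

    mode = "failure": nodes chosen uniformly at random.
    mode = "attack":  nodes taken in decreasing order of initial degree.
    """
    n_remove = int(round(p * len(adj)))
    if mode == "failure":
        victims = rng.sample(sorted(adj), n_remove)
    elif mode == "attack":
        # Assumption: degrees are evaluated once, on the undamaged graph.
        ranked = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
        victims = ranked[:n_remove]
    else:
        raise ValueError("mode must be 'failure' or 'attack'")
    gone = set(victims)
    # Drop the removed nodes and every edge that touched them.
    return {v: nbrs - gone for v, nbrs in adj.items() if v not in gone}

# Toy example: a hub with nine leaves. Attacking 10% of the nodes removes
# the hub and isolates every remaining node.
star = {0: set(range(1, 10))}
star.update({i: {0} for i in range(1, 10)})
attacked = remove_fraction(star, 0.1, "attack", random.Random(0))
failed = remove_fraction(star, 0.1, "failure", random.Random(0))
```

The toy star graph is the extreme case of an inhomogeneous degree distribution: a random failure most likely hits a leaf and leaves the hub in place, whereas an attack always removes the hub.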
Global properties. In Fig. 3 and in Fig. 4 we report L and Eglob as a function of the fraction p of nodes removed. We first perform the same analysis as in
ref. [14] by studying the changes in the characteristic path length L. The scale-free graph considered initially has L ∼ 4.6 (on average, two generic nodes can
be connected in less than 5 steps), a value lower than that of the random graph
(L ∼ 6.7). In the upper part of Fig. 3, we observe for the exponential network
a slow monotonic increase of L with p (for p ≪ 1), both for failures and for
attacks. In practice there is no substantial difference whether the nodes are
selected randomly or in decreasing order of connectivity. This behaviour is
rooted in the homogeneity of the network: since all nodes have approximately
the same number of links, they all contribute equally to the network characteristic path length, thus the removal of a generic node or of the best connected
one causes about the same amount of damage. On the other hand we observe
[Figure 5. Panel (a): C versus p; panel (b): Eloc versus p, for 0 ≤ p ≤ 0.02. Curves: EXP Failure, EXP Attack, SFBA Failure, SFBA Attack.]
Fig. 5. Resistance to failures and attacks: analysis of the local characteristics. BA
scale-free graphs are compared with random graphs. In both cases we consider
graphs with the same initial number of nodes N = 5000 and edges K = 10000. The
clustering coefficient C, in panel (a), and the local efficiency Eloc, in panel (b), are
plotted as a function of p, the fraction of nodes removed.
a drastically different behaviour for scale-free networks (the same observed in
[14]): L remains almost unchanged under an increasing level of errors, while
it increases rapidly when the most connected nodes are eliminated. For example, when 2% of the nodes fail (p = 0.02), the communication between the
remaining nodes in the network is unaffected, while, when the 2% most
connected nodes are removed, L almost doubles with respect to its original value. This
robustness to failures and at the same time vulnerability to attacks is rooted
in the inhomogeneity of the connectivity distribution P(k): the connectivity is
maintained by a few highly connected nodes, whose removal drastically alters
the network's topology, and decreases the ability of the remaining nodes to
communicate with each other.
In the following we show that this behavior can be better quantified by using
Eglob, since the variable in formula (3) is normalized to the ideal case, obtained
when all the N(N − 1)/2 links are present in the graph. In the lower part
of Fig. 3 we observe that initially the scale-free graph has Eglob = 0.24 and
the random graph has Eglob = 0.15, respectively 24% and 15% of the efficiency
of a completely connected graph. When p = 0.02 and the nodes are removed
under attack (i.e. according to their degree), the efficiency of the scale-free
graph has rapidly decreased to Eglob = 0.12: by attacking only a tiny fraction
of the nodes, 2%, the scale-free network has already lost 50% of its efficiency.
Conversely the global efficiency of the scale-free graph does not vary much in
the case of failures. The same thing happens for the exponential graph, where
[Figure 6. Panel (a): C versus p; panel (b): Eloc versus p, for 0 ≤ p ≤ 0.8. Curves: EXP Failure, EXP Attack, SFBA Failure, SFBA Attack.]
Fig. 6. Resistance to failures and attacks: analysis of the local characteristics. BA
scale-free graphs are compared with random graphs. Same as in previous figure, but
now the whole range of p is considered.
the communication between the remaining nodes of the network is unaffected
by either failures or attacks.
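The global efficiency of an unweighted graph can be computed with one breadth-first search per node. The sketch below assumes that formula (3) has the usual form $E_{\rm glob} = \frac{1}{N(N-1)} \sum_{i \neq j} 1/d_{ij}$; the function names, the adjacency-dict representation and the choice of normalizing by the number of nodes currently in the graph are assumptions made for illustration.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def global_efficiency(adj):
    """Average of 1/d_ij over ordered pairs i != j of an unweighted graph.

    Unreachable pairs never appear in the BFS result, so they contribute
    1/infinity = 0 and the measure stays finite on unconnected graphs.
    """
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += 1.0 / d
    return total / (n * (n - 1))
```

A completely connected graph has all $d_{ij} = 1$ and therefore efficiency 1, which is the ideal case used for the normalization; a graph with no edges has efficiency 0.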
So far we have only considered the removal of a small percentage of nodes.
What happens if we extend the analysis to larger values of p, even to values of the order of 1? In this case, it becomes evident that the efficiency
variable is a better quantity to study. In fact, for large values of p, we have to
deal with the problem of the graph becoming unconnected. In the upper part
of Fig. 4, we observe that L reaches very high values when more and more
nodes are removed. In practice, as explained in Section 2, a straightforward
application of the definition in formula (1) would give L = ∞ for p larger
than a certain value p∗ at which the graph becomes unconnected. To avoid
this divergence we have to limit the use of definition (1) to only a part of the
graph, the main connected component (as also done in [14]). In this way, for
different values of p we compare graphs with different numbers of nodes, and
this can give unrealistic results (see Fig. 1), such as the maxima of L observed in
Fig. 4(a). See for example the BA scale-free network (SFBA) under attack:
we have L = 30 for p = 0.1 and then a rapid drop to L = 4 for p = 0.2. This
effect indicates that at p = 0.1 the network starts to fragment into many unconnected small parts (each with more or less the same size), as evidenced by
the cluster size distribution studied in Ref. [14], but at the same time it makes
the comparison of the connectivity properties of graphs with different p unfeasible. In fact the misleading information we get from L is that, by increasing
p, i.e. by removing more nodes, we can get a network with better connectivity
(shorter L). In reality, when we want to compare graphs with p varying in a
wide range of values, it is better to use the efficiency variable. In the lower
part of Fig. 4, one can clearly see that, evaluating Eglob as a function of p,
we get four monotonically decreasing curves, and we avoid the problem of the
unphysical change of slope of L. Again we notice the rapid drop in the global
efficiency of a scale-free network under attack: the removal of 10% of the
nodes completely destroys the global efficiency, which drops to values Eglob ∼ 0.
The removal of nodes by failure produces instead a slower decrease of Eglob
with p. When we compare these two curves with the two analogous curves obtained for an exponential graph, we observe that in the case of a random graph
the difference between failure and attack is less pronounced than in the case
of the SFBA network (though it is clearly visible on such a scale of p, while it
was not visible in the short-range p scale used in Fig. 3(b)). This means that,
besides the sudden drop of Eglob observed for SFBA under attack, there are no
other qualitative differences between scale-free and random graphs when their
properties are compared on a large scale of p.
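The spurious decrease of L can be reproduced on a toy graph. The sketch below is only an illustration with hypothetical helper names: it evaluates L on the largest connected component and the global efficiency on the whole graph, for a chain of nine nodes before and after two interior nodes are removed.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_length_main_component(adj):
    """Characteristic path length L restricted to the largest component."""
    best, seen = {}, set()
    for s in adj:
        if s not in seen:
            comp = bfs_distances(adj, s)
            seen.update(comp)
            if len(comp) > len(best):
                best = comp
    n = len(best)
    if n < 2:
        return 0.0
    total = sum(d for s in best for d in bfs_distances(adj, s).values())
    return total / (n * (n - 1))

def global_efficiency(adj):
    """Average of 1/d_ij over all ordered pairs; unreachable pairs count 0."""
    n = len(adj)
    total = sum(1.0 / d for s in adj
                for d in bfs_distances(adj, s).values() if d > 0)
    return total / (n * (n - 1)) if n > 1 else 0.0

def remove_nodes(adj, gone):
    """Copy of adj without the nodes in gone and without their edges."""
    gone = set(gone)
    return {v: nbrs - gone for v, nbrs in adj.items() if v not in gone}

# Chain 0-1-...-8, then the same chain with nodes 2 and 5 removed,
# which leaves the fragments {0,1}, {3,4} and {6,7,8}.
chain = {i: {j for j in (i - 1, i + 1) if 0 <= j < 9} for i in range(9)}
broken = remove_nodes(chain, [2, 5])
L_before = path_length_main_component(chain)
L_after = path_length_main_component(broken)
E_before, E_after = global_efficiency(chain), global_efficiency(broken)
```

In this example L falls from 10/3 to 4/3 after the removal, suggesting a better connected graph, while the efficiency falls from about 0.46 to about 0.21, which correctly signals the damage.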
The results we have reported in Fig. 3 and Fig. 4 are averages over 10 different realizations. The averaging makes no important difference in the case of
the global properties, although it can be very important for the local quantities,
which are in general affected by larger fluctuations.
Local properties. In Fig. 5 and in Fig. 6 we report C and Eloc as a function
of the fraction p of nodes removed. We start, as before, with two networks, a BA scale-free and a random graph, with N = 5000 nodes and K = 10000 links. Of
course both networks have, by construction, a very small local
clustering, as indicated by the small values of C (0.007 for the BA scale-free
network and less than 0.001 for the random graph) or by the small values of
Eloc (again 0.007 for the BA scale-free network and less than 0.001 for the
random graph). The first thing to notice is that, in agreement with what was said
in Section 2, the values of C and Eloc are very similar. In fact we expect C to be
a reasonable approximation of Eloc when the subgraphs $G_i$ of the neighbours
of a generic node i are composed of very small graphs [7,9]. This is the case
for both the random graph and the scale-free network of the BA model
(things will be different for KE scale-free networks). Since the local clustering
is very small we have large fluctuations among different realizations, and we
must consider an average over different realizations to obtain stable results.
The curves reported in Fig. 5 and in Fig. 6 are averages over 10 different
realizations. Though the local clustering of the two networks is very small, we
observe a rapid drop in the local efficiency (similar to that observed for the
global efficiency) of a scale-free network under attack.
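The relation between C and Eloc can be made concrete with a short sketch. It assumes the usual definitions, $C = \frac{1}{N}\sum_i C_i$ with $C_i$ the fraction of existing links among the neighbours of i, and $E_{\rm loc} = \frac{1}{N}\sum_i E(G_i)$ with $G_i$ the subgraph of the neighbours of i; the function names and the adjacency-dict representation are hypothetical.

```python
from collections import deque

def efficiency(adj):
    """Average of 1/d_ij over ordered pairs i != j (0 if fewer than 2 nodes)."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                    total += 1.0 / dist[v]
    return total / (n * (n - 1))

def neighbour_subgraph(adj, i):
    """Subgraph G_i induced by the neighbours of i (node i itself excluded)."""
    nodes = adj[i]
    return {v: adj[v] & nodes for v in nodes}

def local_efficiency(adj):
    return sum(efficiency(neighbour_subgraph(adj, i)) for i in adj) / len(adj)

def clustering(adj):
    total = 0.0
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k >= 2:
            # Each link among the neighbours is counted twice in the sum.
            links = sum(len(adj[u] & nbrs) for u in nbrs) / 2
            total += links / (k * (k - 1) / 2)
    return total / len(adj)

# Toy graph: node 0 linked to 1, 2, 3, plus the links 1-2 and 2-3.
g = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
```

For the toy graph `g` the clustering coefficient is 5/6 while the local efficiency is 11/12: the neighbours 1 and 3 of node 0 are not directly linked, so they do not count for C, but they are two steps apart inside $G_0$ and still contribute 1/2 to $E(G_0)$. The two measures coincide when the subgraphs $G_i$ contain only direct links or no links at all.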
[Figure 7. Panel (a): L versus p; panel (b): Eglob versus p, for 0 ≤ p ≤ 0.02. Curves: EXP Failure, EXP Attack, SFKE Failure, SFKE Attack.]
Fig. 7. Resistance to failures and attacks: analysis of the global characteristics.
KE scale-free graphs are compared with random graphs. In both cases we have
graphs with the same initial number of nodes N = 5000 and edges K = 10000. The
characteristic path length L and the global efficiency Eglob are plotted as a function of p, the
fraction of nodes removed.
4.4 Error and attack tolerance of KE scale-free networks
We now repeat the same analysis for the class of scale-free networks generated
by the KE model, i.e. for networks with power-law degree distribution and
at the same time strong clustering. We can consider such networks as small-worlds with power-law degree distribution. We start by considering a KE scale-free (SFKE) network with N = 5000 nodes and K = 10000 edges, generated
by the prescription of the KE model of Section 4.2 with µ = 0.1 (such a
scale-free network also has small-world properties: in fact it has Eglob = 0.12
and Eloc = 0.54). As in the previous section we remove the nodes by using the
two different strategies simulating failures or attacks, and we investigate how
the properties of the network change by reporting as a function of p the two
quantities L and C, or the two quantities Eglob and Eloc.
Global properties. In Fig. 7 and in Fig. 8 we report L and Eglob as a function of the fraction p of nodes removed. The KE scale-free graph considered
initially has now L ∼ 9.5 (two generic nodes can be connected in an average
of about 10 steps). This value is higher than the value obtained for SFBA networks
(L ∼ 4.6), and also higher than that of random graphs (L ∼ 6.7). This is of
course the price to pay to have a strong local clustering: the increase in local
connectivity is obtained at the expense of the global connectivity. In any case,
the results are similar to those obtained for the BA scale-free networks, though
[Figure 8. Panel (a): L versus p; panel (b): Eglob versus p, for 0 ≤ p ≤ 0.8. Curves: EXP Failure, EXP Attack, SFKE Failure, SFKE Attack.]
Fig. 8. Resistance to failures and attacks: analysis of the global characteristics. KE
scale-free graphs are compared with random graphs. Same as in previous figure, but
now the whole range of p is considered.
the difference between scale-free and exponential networks is less marked. In
the upper part of Fig. 7 we observe on the one hand that the exponential network
has a slow monotonic increase of L with p (for p ≪ 1), both for failures and for
attacks, and on the other hand that for scale-free networks L remains almost
unchanged under an increasing level of errors, while it increases rapidly when
the most connected nodes are eliminated. In the lower part of Fig. 7 we
see that the same behavior is confirmed when the global connectivity of the
graph is expressed in terms of the efficiency Eglob: the initial efficiency of the
scale-free graph, Eglob = 0.12 (12% of the efficiency of the completely connected
graph), decreases to Eglob = 0.08 by attacking 2% of the nodes (though
this result is not as drastic as in the case of BA networks, compare with
Fig. 3). The global efficiency of the scale-free graph does not vary much in the
case of failures. In Fig. 8 we consider a larger range of values of p. From panel
(a) we see again that the correct variable to evaluate is Eglob and not L. In
fact L would give an unphysical result, namely the presence of a spurious maximum
when the network becomes unconnected. From the plot of Eglob versus p in
Fig. 8(b) we observe that the KE scale-free and the exponential graph have a
similar behavior as a function of p, when compared on the whole scale of p,
apart from a different normalization factor, i.e. a different value at p = 0. A
qualitatively different behavior in the global properties of KE scale-free and
exponential graphs is observed only for p < 0.02 (compare Fig. 7 to Fig. 8),
i.e. only when a very small fraction of nodes is removed.
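The different severity of the attack on the two scale-free models can be read off from the relative loss of global efficiency at p = 0.02, using only the values quoted in the text (the labels BA and KE on the left-hand sides are notation introduced here):

```latex
\left. \frac{E_{\rm glob}(0) - E_{\rm glob}(0.02)}{E_{\rm glob}(0)} \right|_{\rm BA}
  = \frac{0.24 - 0.12}{0.24} = 50\% ,
\qquad
\left. \frac{E_{\rm glob}(0) - E_{\rm glob}(0.02)}{E_{\rm glob}(0)} \right|_{\rm KE}
  = \frac{0.12 - 0.08}{0.12} \approx 33\% .
```

The KE network therefore loses about one third of its global efficiency under the same attack that halves the efficiency of the BA network.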
[Figure 9. Panel (a): C versus p; panel (b): Eloc versus p, for 0 ≤ p ≤ 0.02. Curves: EXP Failure, EXP Attack, SFKE Failure, SFKE Attack.]
Fig. 9. Resistance to failures and attacks: analysis of the local characteristics. KE
scale-free graphs are compared with random graphs. In both cases we have
graphs with the same initial number of nodes N = 5000 and edges K = 10000. The
clustering coefficient C and the local efficiency Eloc are plotted as a function of p, the
fraction of nodes removed.
Local properties. In Fig. 9 and in Fig. 10 we report C and Eloc as a function
of the fraction p of nodes removed. We observe that the KE scale-free network has a good
local connectivity, expressed by a clustering coefficient C = 0.43 and by Eloc =
0.54 (meaning that the graph has 54% of the local efficiency of the completely
connected graph). Notice that for KE scale-free networks the numerical values
of Eloc and C are not similar to each other, as they were in BA scale-free
networks. In fact for SFKE networks the subgraph $G_i$ of the neighbours of
a generic node i is not always a very small graph and therefore C is no longer a
good approximation of Eloc [7,9]. Though the numerical value of C
is different from that of Eloc, the information we get from the behavior of
these two quantities as a function of p is similar. We observe, both in Fig. 9
and in Fig. 10, a rapid decrease in the local efficiency (and in the clustering
coefficient C) of SFKE networks under attack, while the local efficiency (and
C) decreases much more slowly under failures. The values of Eloc and C for random graphs
(the same curves were plotted in Fig. 5 and Fig. 6 on a larger scale) are here
orders of magnitude smaller than the values of the local efficiency of SFKE
networks, and are practically indistinguishable from zero on the scale adopted.
[Figure 10. Panel (a): C versus p; panel (b): Eloc versus p, for 0 ≤ p ≤ 0.8. Curves: EXP Failure, EXP Attack, SFKE Failure, SFKE Attack.]
Fig. 10. Resistance to failures and attacks: analysis of the local characteristics. KE
scale-free graphs are compared with random graphs. Same as in previous figure, but
now the whole range of p is considered.
5 Conclusions
In this paper we have studied the effects of errors and attacks on the efficiency of scale-free networks. Two different kinds of scale-free networks have
been considered and compared to random graphs: scale-free networks with
no local clustering produced by the Barabasi-Albert (BA) model, and scale-free networks with high clustering properties as in the model by Klemm and
Eguíluz (KE). By using as mathematical measures the global and the local
efficiency, we have investigated the effects of errors and attacks both on the
global and on the local properties of the network. We have found that both
the global and the local efficiency of scale-free networks are unaffected by the
failure of some of the nodes, i.e. when some (up to 2%) of the nodes are chosen
at random and removed. On the other hand, at variance with random graphs,
in scale-free networks the global and the local efficiency rapidly decrease when
the nodes removed are those with higher connectivity $k_i$, i.e. scale-free networks are extremely sensitive to attacks. These properties hold both for
BA networks and for KE networks, though KE networks have higher local
efficiency but lower global efficiency than BA networks. We have also studied the effects of errors and attacks when a large number of nodes (even up
to 80% of the nodes of the network) are removed. On such a larger scale
of p the difference between scale-free networks and random graphs is less pronounced than on the smaller scale p < 0.02. When a large number of nodes
are removed, especially when the network becomes unconnected, the efficiency
variable is definitely a better quantity than the characteristic path length L
to measure the response of the networks to external factors.
References
[1] Y. Bar-Yam, Dynamics of Complex Systems (Addison-Wesley, Reading Mass,
1997).
[2] S.H. Strogatz, Nature 410, 268 (2001).
[3] R. Albert and A.-L. Barabási, Reviews of Modern Physics 74, 47 (2002).
[4] M.E.J. Newman, J. Stat. Phys. 101, 819 (2000).
[5] D.J. Watts and S.H. Strogatz, Nature 393, 440 (1998).
[6] S. Milgram, Psychol. Today, 2, 60 (1967).
[7] V. Latora and M. Marchiori, Phys. Rev. Lett. 87, 198701 (2001).
[8] V. Latora and M. Marchiori, cond-mat/0202299, Proceedings of the
International Conference “Horizons in Complex Systems”, Messina December
2001, to appear on Physica A.
[9] V. Latora and M. Marchiori, cond-mat/0204089 and submitted to Phys. Rev.
E.
[10] M. Marchiori and V. Latora, Physica A285, 539 (2000).
[11] R. Albert, H. Jeong, and A.-L. Barabási, Nature 401, 130 (1999).
[12] A.-L. Barabási and R. Albert, Science 286, 509 (1999).
[13] A.-L. Barabási, R. Albert and H. Jeong, Physica A272, 173 (1999).
[14] R. Albert, H. Jeong, and A.-L. Barabási, Nature 406, 378 (2000); Correction
Nature 409, 542 (2001).
[15] H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai and A.-L. Barabási, Nature 407,
651 (2000).
[16] H. Jeong, S.P. Mason, A.-L. Barabási and Z.N. Oltvai, Nature 411, 41 (2001).
[17] B. Bollobás, Random Graphs (Academic, London, 1985).
[18] K. Klemm and V.M. Eguíluz, Phys. Rev. E65, 057102 (2002).
[19] The Internet Movie Database, http://www.imdb.com
[20] T.B. Achacoso and W.S. Yamamoto, AY’s Neuroanatomy of C. elegans for
Computation (CRC Press, Boca Raton, FL, 1992).
[21] The formalism can be easily extended to the case of weighted networks [7,9].
Since in this paper we are interested in the study of unweighted networks we
have directly presented the definition of the efficiency in the particular and
simpler case of unweighted networks. In the general definition valid for weighted
and unweighted networks a normalization factor has to be introduced to have:
0 ≤ Eglob (G) ≤ 1 and 0 ≤ Eloc (G) ≤ 1 (see refs. [7,9]).
[22] As done for example in ref. [5] when the collaboration network of movie actors
is studied, or in all the examples of [14].
[23] F.L. Liljeros, C.R. Edling, N. Amaral, H.E. Stanley, and Y. Aberg, Nature 411,
907 (2001).
[24] R. Pastor-Satorras and A. Vespignani, Phys. Rev. Lett. 86, 3200 (2001).
[25] J.P.K. Doye, cond-mat/0201430.
[26] L. A. N. Amaral, A. Scala, M. Barthélémy, and H. E. Stanley, Proc. Natl. Acad.
Sci. 97, 11149 (2000).
[27] S. Dorogovtsev, J. Mendes and A.N. Samukhin, Phys. Rev. Lett. 85, 4633 (2000).
[28] P.L. Krapivsky, S. Redner and F. Leyvraz, Phys. Rev. Lett. 85, 4629 (2000).
[29] H.A. Simon, Biometrika 42, 425 (1955).
[30] G.K. Zipf, Human Behaviour and the Principle of Least Effort (Addison-Wesley, Cambridge, Massachusetts, 1949).
[31] S. Bornholdt and H. Ebel, Phys. Rev. E64, 035104(R) (2001).
[32] K. Klemm and V.M. Eguíluz, Phys. Rev. E65, 036123 (2002).