CH 06

CSC 407/607
Network Analysis
Chapter 6, Cascade and Epidemics
All chapter slides adopted from

Mehmet H Gunes, Constantine
Dovrolis, Eileen Kraemer, Peter Dodds,
Sergei Maslov, and Jure Leskovec, and
textbook
1
Models of Influence
• First mathematical models

– [Schelling '70/'78, Granovetter '78]
• Large body of subsequent work:
– [Rogers '95, Valente '95, Wasserman/Faust '94]
• Two basic classes of diffusion models: threshold and
cascade
• General operational view:
– Nodes start either active or inactive.
– An active node may trigger activation of neighboring
nodes.
– Think of it as a game of activating the nodes as it spreads
throughout the graph.
– Sometimes Monotonicity assumption: active nodes never
deactivate.
2
Information cascade
• Users often repost content posted

by others in the network.
– Content is often received via immediate
neighbors (friends).
• Information propagates through
friends.
An information cascade: a piece of information/decision cascaded among

some users, where
• individuals are connected by a network and
• individuals are only observing decisions of their immediate
neighbors (friends).
3
Information cascade
• Network is a directed graph.

– Nodes are actors.
– Edges depict the communication channels between them.
• A node can only influence nodes that it is directly

connected to.
• Decisions are binary. Nodes can

– Active: the node has adopted the behavior/innovation/decision.
– Inactive: otherwise
• An activated node can activate its neighboring nodes; and

activation is a progressive process, where nodes change
from inactive to active.
4
Independent Cascade Model (ICM)
• Independent Cascade Model (ICM)

– Sender-centric model of cascade
– Each node has one chance to activate its neighbors.
• In ICM, nodes that are active are senders and nodes that
are being activated are receivers.
• Algorithm:
– Node activated at time 𝑡, has one chance, at time step 𝑡 + 1, to
activate its neighbors.
– Assume 𝑣 is activated at time 𝑡
• For any neighbor 𝑤 of 𝑣, there’s a probability 𝑝𝑣𝑤 that node
𝑤 gets activated at time 𝑡 + 1
– Node 𝑣 activated at time 𝑡 has a single chance of activating its
neighbors
• The activation can only happen at 𝑡 + 1
5
ICM algorithm
6
ICM example
Obviously, some nodes

might never get activated.
Hence, interesting to study
the active ratio. 7
ICM example 2
0.6
Inactive Node
0.3 0.2 0.2 Active Node
Newly active
X 0.1 U
0.4 node
Successful
0.5 0.3 0.2
attempt
Unsuccessful
0.5 attempt
w
v
Algorithm should stop after all active
Stop! nodes had a one-time shot at activating
their neighbors.
8
Linear Threshold Model (LTM)
• A node 𝑣 has random threshold 𝜃𝑣 ∼ 𝑈 0,1
• Node 𝑣 is influenced by each neighbor 𝑤 according
to a weight 𝑏𝑣,𝑤 such that
෍ 𝑏𝑣,𝑤 ≤ 1
𝑤∈𝑁𝑣
• A node 𝑣 becomes active when at least
(weighted) 𝜃𝑣 fraction of its neighbors are active
෍ 𝑏𝑣,𝑤 ≤ 1
(active)
𝑤∈𝑁𝑣 𝑁𝑣 : all neighbors of 𝑣.
(active)
𝑁𝑣 : all active neighbors of 𝑣.
9
LTM example
Inactive Node
0.6
Active Node
0.3 0.2 0.2 Threshold
X Active neighbors
0.1
0.4 U
0.5 0.3
0.2 Stop!
0.5
w v
v
The stoppage of the algorithm is less
clear. What now?
10
Spread of viral messages
• We move on to study how viral messages spread in graphs.
• Imagine a problem: Given

– Graph G and a limited budget B for initial advertising
• e.g. give away free samples of product
– Estimates for influence between individuals
• Could be inaccurate or even dynamic
• Goal
– trigger a large cascade of influence
• e.g. further adoptions of a product
• Question
– Which set of individuals should B target at?
• Application besides product marketing
– spread a political opinion.
– detect stories in blogs. 11
Mapped to cascade
• Maximizing the Spread of

Cascades: the problem of
ﬁnding a small set of nodes
in a social network such that
their aggregated spread in
the network is maximized.
• In real world, multiple online

social networks are used:
Twitter, Facebook, Reddit,
etc.
12
Constrained optimization
• Spread of node set 𝑆: 𝑓(𝑆)

– An expected number of activated nodes at the end of the
cascade, if set 𝑆 is the initial active set.
• Problem:
– Given a parameter 𝑘 (budget), find a 𝑘-node set 𝑆 to maximize
𝑓(𝑆)
– A constrained optimization problem with 𝑓(𝑆) as the objective
function.
13
f(S) properties
1. Non-negative (obviously)
2. Monotone
f (S  v)  f (S )
3. Submodular
– Let 𝑁 be a finite set
– A set function 𝑓 is submodular if and only if
f : 2N 
S  T  N , v  N \ T ,
f ( S  v )  f ( S )  f (T  v )  f (T )
14
NP-completeness
• Bad News
– For a submodular function monotone non-negative 𝑓, finding a 𝑘-
element set 𝑆 for which 𝑓(𝑆) is maximized is an NP-hard problem.
– It is NP-hard to determine the optimum for influence maximization
for independent cascade.
• Good News
– We can use a greedy algorithm:
• Start with an empty set 𝑆
• For 𝑘 iterations:
Add node 𝑣 to 𝑆 that maximizes 𝑓(𝑆 ∪ {𝑣}) − 𝑓(𝑆).
– How good (or bad) it is?
• Theorem: the greedy algorithm provides a (1 – 1/𝑒)
approximation
• The resulting set 𝑆 activates at least (1 − 1/𝑒) > 63% of the
number of nodes that any size-𝑘 set 𝑆 could activate. 15
Greedy algorithm
16
Network game
• Choice between two things, 𝐴 and 𝐵

– e.g. basketball and soccer
• if friends choose 𝐴, they get payoff 𝑎

• if friends choose 𝐵, they get payoff 𝑏
• if one chooses 𝐴 while the other chooses 𝐵, their payoff

is 0.
17
Choose with friends
Let A = basketball, B = soccer. Which one should you play?
fraction p = 3/5 play basketball fraction p = 2/5 play soccer

18
Higher payoff selection
• 𝑑 neighbors
• 𝑝 fraction play basketball
(A)
• 1 − 𝑝 fraction play soccer
(B)
• if choose A, get payoff

𝑝𝑑𝑎
• if choose B, get payoff
1 − 𝑝 𝑑𝑏
• Decide based on
expected payoff. Choose
A if
𝑝𝑑𝑎 ≥ 1 − 𝑝 𝑑𝑏
𝑏
i.e. 𝑝 ≥ 𝑎+𝑏
19
Equilibrium
• Possible to reach equilibriums:

– Everyone chooses A; or
– Everyone chooses B.
• What if two nodes switch at random? Will a cascade occur?
• example:
– a = 3, b = 2
– payoff for nodes interaction using behavior A is 3/2

• as large as what they get if they both choose B.
– nodes will switch from B to A if at least

20
q = 2/(3+2) = 2/5 of their neighbors are playing A.
How does it spread in graphs?
• suppose 2 users start playing basketball due to external

factors
– e.g. they are bribed with a free pair of basketball shoes by a
certain company.
Now suppose you

can choose two
users (Budget B)
from the network
to start. Which two
will you choose?
21
Epidemics
• Epidemics describes the process by which diseases

spread. This process consists of
– A pathogen
• The disease being spread.
• Tweet being retweeted.
– A population of hosts
• Humans, animals, plants, etc.
• Epidemic models assume an implicit network and

unknown connections between users.
• Epidemic models are more suitable when we are

interested in global patterns,
– But not who infected whom, i.e., individualism not interesting.
22
Epidemics model: SIS
Cured
• The network consists of nodes (individual)
and edges (links between nodes). S I
• Each node is in one of two states
– Susceptible (in other words, healthy) Infected by neighbor
– Infected
• Susceptible-Infected-Susceptible (SIS)
model
– Cured nodes immediately become susceptible
and may get infected at some point.
– New infection may take place any time.
23
Birth/death rates
• Homogeneous birth rate 𝛽 on all edges between infected

and susceptible nodes.
• Homogeneous death rate 𝛿 for infected nodes.
– So “birth” is about getting infected, “death” is about getting cured!
Healthy
N2
Prob. δ
Prob. β
N1 X
Infected
N3
24
SIR vs. SIS
Cured and
• Susceptible-Infected-Removed become immune
– Sometimes called Recovered.
• Cured users become immune, so S I R
they will never be infected again.
Infected by neighbor
• Homogeneous death rate 𝛿 for
infected nodes.
25
Modeling SI
𝐼
• 𝑁: size of the crowd S I
• 𝑆(𝑡): number of susceptible individuals at time 𝑡

– rate: 𝑠(𝑡) = 𝑆(𝑡)/𝑁
• 𝐼(𝑡): number of infected individuals at time 𝑡
– rate: 𝑖(𝑡) = 𝐼(𝑡)/𝑁
• : Contact probability
– if  = 1 everyone comes to contact with everyone else.
– if  = 0 no one meets another individual.
– Real world scenario 0 < 𝛽 < 1
𝑁 = 𝑆(𝑡) + 𝐼(𝑡)
26
Modeling SI, cont’d
• At each time stamp, an infected individual will meet

𝑁 people on average and will infect 𝑆 of them
• Since 𝐼 are infected, 𝐼𝑆 will be infected in the next time
step (so assuming no overlaps)
𝐼0 : initial
number of
infected.
(𝑆 + 𝐼 = 𝑁)
27
SI graph
Logistic growth function compared to the HIV/AIDS growth in the United States
28
Modeling SIR
𝐼 𝛾𝐼
• 𝑁: size of the crowd S I R
• 𝑆(𝑡): number of susceptible

individuals at time 𝑡
– 𝑠(𝑡) = 𝑆(𝑡)/𝑁
• 𝐼(𝑡): number of infected
– 𝑖(𝑡) = 𝐼(𝑡)/𝑁
• R(𝑡): number of recovered
– 𝑟(𝑡) = 𝑅(𝑡)/𝑁
𝐼(𝑡) + 𝑆(𝑡) + 𝑅(𝑡) = 𝑁 𝛾 : recovering

probability of an
infected.
29
Modeling SIR, cont’d
(𝑅0 = 0)
30
SIR graph
S R
SIR model simulated with S0 = 99, I0 = 1, R0 = 0,  = 0.01, and  = 0.1
31
Modeling SIS
𝛾𝐼
𝐼
S I
• The SIS model is the same as the SI
model with the addition of infected nodes
recovering and becoming susceptible
again.
32
Modeling SIS, cont’d
𝛾
Case 1: When 𝑵 ≤ 𝜸 (or when 𝑁 ≤ ):

– The first term will be at most zero or negative
– The whole term becomes negative
– In the limit, I(t) will decrease exponentially to zero
𝛾
Case 2: When 𝑵 > 𝜸 (or when 𝑁 > ):

– We will have a logistic growth function like the SI model
33
SIS graph
S I
34
SIS model simulated with S0 = 99, I0 = 1,  = 0.01, and  = 0.1
Modeling SIRS
𝜆𝑅
• The individuals who have recovered
will lose immunity after a certain 𝐼 𝛾𝐼
S I R
period of time and will become
susceptible again.
35
SIRS graph
S
R
I
SIRS model simulated with S0 = 99, I0 = 1, R0 = 1,  = 0.01, 𝜆 = 0.02, and  = 0.1 36

Furthering epidemics analysis
• Many things to study:

– How fast will users be infected?
– At what point will a certain percentage of users be infected?
– How does a node’s position in the graph change its infection?
• Epidemic analysis can also be used to study information or

opinion spread.
37

CH 06

Uploaded by

Copyright:

Available Formats

CH 06

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

CH 06

Uploaded by

Copyright:

Available Formats

CSC 407/607

Chapter 6, Cascade and Epidemics

All chapter slides adopted from

• First mathematical models

• Users often repost content posted

An information cascade: a piece of information/decision cascaded among

• Network is a directed graph.

• A node can only influence nodes that it is directly

• Decisions are binary. Nodes can

• An activated node can activate its neighboring nodes; and

• Independent Cascade Model (ICM)

Obviously, some nodes

0.3 0.2 0.2 Active Node

0.3 0.2 0.2 Threshold

• We move on to study how viral messages spread in graphs.

• Imagine a problem: Given

• Maximizing the Spread of

• In real world, multiple online

• Spread of node set 𝑆: 𝑓(𝑆)

• Choice between two things, 𝐴 and 𝐵

• if friends choose 𝐴, they get payoff 𝑎

• if one chooses 𝐴 while the other chooses 𝐵, their payoff

Let A = basketball, B = soccer. Which one should you play?

fraction p = 3/5 play basketball fraction p = 2/5 play soccer

• if choose A, get payoff

• Possible to reach equilibriums:

• What if two nodes switch at random? Will a cascade occur?

– payoff for nodes interaction using behavior A is 3/2

– nodes will switch from B to A if at least

• suppose 2 users start playing basketball due to external

Now suppose you

• Epidemics describes the process by which diseases

• Epidemic models assume an implicit network and

• Epidemic models are more suitable when we are

• Homogeneous birth rate 𝛽 on all edges between infected

• 𝑆(𝑡): number of susceptible individuals at time 𝑡

• At each time stamp, an infected individual will meet

• 𝑆(𝑡): number of susceptible

𝐼(𝑡) + 𝑆(𝑡) + 𝑅(𝑡) = 𝑁 𝛾 : recovering

SIR model simulated with S0 = 99, I0 = 1, R0 = 0,  = 0.01, and  = 0.1

SIRS model simulated with S0 = 99, I0 = 1, R0 = 1,  = 0.01, 𝜆 = 0.02, and  = 0.1 36

• Many things to study:

• Epidemic analysis can also be used to study information or

You might also like