CH 06

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 37

CSC 407/607

Network Analysis

Chapter 6, Cascade and Epidemics

All chapter slides adopted from


Mehmet H Gunes, Constantine
Dovrolis, Eileen Kraemer, Peter Dodds,
Sergei Maslov, and Jure Leskovec, and
textbook
1
Models of Influence

• First mathematical models


– [Schelling '70/'78, Granovetter '78]
• Large body of subsequent work:
– [Rogers '95, Valente '95, Wasserman/Faust '94]
• Two basic classes of diffusion models: threshold and
cascade
• General operational view:
– Nodes start either active or inactive.
– An active node may trigger activation of neighboring
nodes.
– Think of it as a game of activating the nodes as it spreads
throughout the graph.
– Sometimes Monotonicity assumption: active nodes never
deactivate.
2
Information cascade

• Users often repost content posted


by others in the network.
– Content is often received via immediate
neighbors (friends).
• Information propagates through
friends.

An information cascade: a piece of information/decision cascaded among


some users, where
• individuals are connected by a network and
• individuals are only observing decisions of their immediate
neighbors (friends).
3
Information cascade

• Network is a directed graph.


– Nodes are actors.
– Edges depict the communication channels between them.

• A node can only influence nodes that it is directly


connected to.

• Decisions are binary. Nodes can


– Active: the node has adopted the behavior/innovation/decision.
– Inactive: otherwise

• An activated node can activate its neighboring nodes; and


activation is a progressive process, where nodes change
from inactive to active.
4
Independent Cascade Model (ICM)

• Independent Cascade Model (ICM)


– Sender-centric model of cascade
– Each node has one chance to activate its neighbors.
• In ICM, nodes that are active are senders and nodes that
are being activated are receivers.
• Algorithm:
– Node activated at time 𝑡, has one chance, at time step 𝑡 + 1, to
activate its neighbors.
– Assume 𝑣 is activated at time 𝑡
• For any neighbor 𝑤 of 𝑣, there’s a probability 𝑝𝑣𝑤 that node
𝑤 gets activated at time 𝑡 + 1
– Node 𝑣 activated at time 𝑡 has a single chance of activating its
neighbors
• The activation can only happen at 𝑡 + 1

5
ICM algorithm

6
ICM example

Obviously, some nodes


might never get activated.
Hence, interesting to study
the active ratio. 7
ICM example 2

0.6
Inactive Node

0.3 0.2 0.2 Active Node

Newly active
X 0.1 U
0.4 node
Successful
0.5 0.3 0.2
attempt
Unsuccessful
0.5 attempt
w
v
Algorithm should stop after all active
Stop! nodes had a one-time shot at activating
their neighbors.
8
Linear Threshold Model (LTM)
• A node 𝑣 has random threshold 𝜃𝑣 ∼ 𝑈 0,1
• Node 𝑣 is influenced by each neighbor 𝑤 according
to a weight 𝑏𝑣,𝑤 such that

෍ 𝑏𝑣,𝑤 ≤ 1
𝑤∈𝑁𝑣
• A node 𝑣 becomes active when at least
(weighted) 𝜃𝑣 fraction of its neighbors are active
෍ 𝑏𝑣,𝑤 ≤ 1
(active)
𝑤∈𝑁𝑣 𝑁𝑣 : all neighbors of 𝑣.
(active)
𝑁𝑣 : all active neighbors of 𝑣.
9
LTM example

Inactive Node
0.6
Active Node

0.3 0.2 0.2 Threshold

X Active neighbors
0.1
0.4 U

0.5 0.3
0.2 Stop!
0.5
w v
v
The stoppage of the algorithm is less
clear. What now?
10
Spread of viral messages

• We move on to study how viral messages spread in graphs.

• Imagine a problem: Given


– Graph G and a limited budget B for initial advertising
• e.g. give away free samples of product
– Estimates for influence between individuals
• Could be inaccurate or even dynamic
• Goal
– trigger a large cascade of influence
• e.g. further adoptions of a product
• Question
– Which set of individuals should B target at?
• Application besides product marketing
– spread a political opinion.
– detect stories in blogs. 11
Mapped to cascade

• Maximizing the Spread of


Cascades: the problem of
finding a small set of nodes
in a social network such that
their aggregated spread in
the network is maximized.

• In real world, multiple online


social networks are used:
Twitter, Facebook, Reddit,
etc.

12
Constrained optimization

• Spread of node set 𝑆: 𝑓(𝑆)


– An expected number of activated nodes at the end of the
cascade, if set 𝑆 is the initial active set.

• Problem:
– Given a parameter 𝑘 (budget), find a 𝑘-node set 𝑆 to maximize
𝑓(𝑆)
– A constrained optimization problem with 𝑓(𝑆) as the objective
function.

13
f(S) properties
1. Non-negative (obviously)

2. Monotone
f (S  v)  f (S )
3. Submodular
– Let 𝑁 be a finite set
– A set function 𝑓 is submodular if and only if

f : 2N 
S  T  N , v  N \ T ,
f ( S  v )  f ( S )  f (T  v )  f (T )

14
NP-completeness
• Bad News
– For a submodular function monotone non-negative 𝑓, finding a 𝑘-
element set 𝑆 for which 𝑓(𝑆) is maximized is an NP-hard problem.
– It is NP-hard to determine the optimum for influence maximization
for independent cascade.
• Good News
– We can use a greedy algorithm:
• Start with an empty set 𝑆
• For 𝑘 iterations:
Add node 𝑣 to 𝑆 that maximizes 𝑓(𝑆 ∪ {𝑣}) − 𝑓(𝑆).
– How good (or bad) it is?
• Theorem: the greedy algorithm provides a (1 – 1/𝑒)
approximation
• The resulting set 𝑆 activates at least (1 − 1/𝑒) > 63% of the
number of nodes that any size-𝑘 set 𝑆 could activate. 15
Greedy algorithm

16
Network game

• Choice between two things, 𝐴 and 𝐵


– e.g. basketball and soccer

• if friends choose 𝐴, they get payoff 𝑎


• if friends choose 𝐵, they get payoff 𝑏

• if one chooses 𝐴 while the other chooses 𝐵, their payoff


is 0.

17
Choose with friends

Let A = basketball, B = soccer. Which one should you play?

fraction p = 3/5 play basketball fraction p = 2/5 play soccer


18
Higher payoff selection

• 𝑑 neighbors
• 𝑝 fraction play basketball
(A)
• 1 − 𝑝 fraction play soccer
(B)

• if choose A, get payoff


𝑝𝑑𝑎
• if choose B, get payoff
1 − 𝑝 𝑑𝑏

• Decide based on
expected payoff. Choose
A if
𝑝𝑑𝑎 ≥ 1 − 𝑝 𝑑𝑏
𝑏
i.e. 𝑝 ≥ 𝑎+𝑏
19
Equilibrium

• Possible to reach equilibriums:


– Everyone chooses A; or
– Everyone chooses B.

• What if two nodes switch at random? Will a cascade occur?

• example:
– a = 3, b = 2

– payoff for nodes interaction using behavior A is 3/2


• as large as what they get if they both choose B.

– nodes will switch from B to A if at least


20
q = 2/(3+2) = 2/5 of their neighbors are playing A.
How does it spread in graphs?

• suppose 2 users start playing basketball due to external


factors
– e.g. they are bribed with a free pair of basketball shoes by a
certain company.

Now suppose you


can choose two
users (Budget B)
from the network
to start. Which two
will you choose?

21
Epidemics

• Epidemics describes the process by which diseases


spread. This process consists of
– A pathogen
• The disease being spread.
• Tweet being retweeted.
– A population of hosts
• Humans, animals, plants, etc.

• Epidemic models assume an implicit network and


unknown connections between users.

• Epidemic models are more suitable when we are


interested in global patterns,
– But not who infected whom, i.e., individualism not interesting.
22
Epidemics model: SIS

Cured
• The network consists of nodes (individual)
and edges (links between nodes). S I
• Each node is in one of two states
– Susceptible (in other words, healthy) Infected by neighbor
– Infected
• Susceptible-Infected-Susceptible (SIS)
model
– Cured nodes immediately become susceptible
and may get infected at some point.
– New infection may take place any time.

23
Birth/death rates

• Homogeneous birth rate 𝛽 on all edges between infected


and susceptible nodes.
• Homogeneous death rate 𝛿 for infected nodes.
– So “birth” is about getting infected, “death” is about getting cured!

Healthy
N2
Prob. δ
Prob. β
N1 X

Infected
N3
24
SIR vs. SIS

Cured and
• Susceptible-Infected-Removed become immune
– Sometimes called Recovered.
• Cured users become immune, so S I R
they will never be infected again.
Infected by neighbor
• Homogeneous death rate 𝛿 for
infected nodes.

25
Modeling SI
𝐼
• 𝑁: size of the crowd S I

• 𝑆(𝑡): number of susceptible individuals at time 𝑡


– rate: 𝑠(𝑡) = 𝑆(𝑡)/𝑁
• 𝐼(𝑡): number of infected individuals at time 𝑡
– rate: 𝑖(𝑡) = 𝐼(𝑡)/𝑁
• : Contact probability
– if  = 1 everyone comes to contact with everyone else.
– if  = 0 no one meets another individual.
– Real world scenario 0 < 𝛽 < 1

𝑁 = 𝑆(𝑡) + 𝐼(𝑡)
26
Modeling SI, cont’d

• At each time stamp, an infected individual will meet


𝑁 people on average and will infect 𝑆 of them
• Since 𝐼 are infected, 𝐼𝑆 will be infected in the next time
step (so assuming no overlaps)

𝐼0 : initial
number of
infected.

(𝑆 + 𝐼 = 𝑁)

27
SI graph

Logistic growth function compared to the HIV/AIDS growth in the United States

28
Modeling SIR
𝐼 𝛾𝐼
• 𝑁: size of the crowd S I R

• 𝑆(𝑡): number of susceptible


individuals at time 𝑡
– 𝑠(𝑡) = 𝑆(𝑡)/𝑁
• 𝐼(𝑡): number of infected
individuals at time 𝑡
– 𝑖(𝑡) = 𝐼(𝑡)/𝑁
• R(𝑡): number of recovered
individuals at time 𝑡
– 𝑟(𝑡) = 𝑅(𝑡)/𝑁

𝐼(𝑡) + 𝑆(𝑡) + 𝑅(𝑡) = 𝑁 𝛾 : recovering


probability of an
infected.
29
Modeling SIR, cont’d

(𝑅0 = 0)

30
SIR graph

S R

SIR model simulated with S0 = 99, I0 = 1, R0 = 0,  = 0.01, and  = 0.1

31
Modeling SIS
𝛾𝐼

𝐼
S I
• The SIS model is the same as the SI
model with the addition of infected nodes
recovering and becoming susceptible
again.

32
Modeling SIS, cont’d

𝛾
Case 1: When 𝑵 ≤ 𝜸 (or when 𝑁 ≤ ):

– The first term will be at most zero or negative
– The whole term becomes negative
– In the limit, I(t) will decrease exponentially to zero

𝛾
Case 2: When 𝑵 > 𝜸 (or when 𝑁 > ):

– We will have a logistic growth function like the SI model

33
SIS graph

S I

34
SIS model simulated with S0 = 99, I0 = 1,  = 0.01, and  = 0.1
Modeling SIRS
𝜆𝑅
• The individuals who have recovered
will lose immunity after a certain 𝐼 𝛾𝐼
S I R
period of time and will become
susceptible again.

35
SIRS graph

S
R
I

SIRS model simulated with S0 = 99, I0 = 1, R0 = 1,  = 0.01, 𝜆 = 0.02, and  = 0.1 36


Furthering epidemics analysis

• Many things to study:


– How fast will users be infected?
– At what point will a certain percentage of users be infected?
– How does a node’s position in the graph change its infection?

• Epidemic analysis can also be used to study information or


opinion spread.

37

You might also like