Mixed Membership Stochastic Blockmodels: (2008) Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg and Eric P. Xing

Download as pdf or txt
Download as pdf or txt
You are on page 1of 31

Mixed Membership Stochastic Blockmodels

(2008) Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg


and Eric P. Xing

Herrissa Lamothe

Princeton University

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 1 / 28


Outline

1 Overview

2 The MMSB Model


Mixed Membership
Model Estimation

3 Application of Mixed Membership Model


Empirical and Synthetic Data
Drawbacks to the MMSB
Model Flexibility

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 2 / 28


Motivating question

Are there certain ”rules” dictating how individual vertices (or nodes) make
”decisions” to connect/not to connect to other vertices?

Let’s suppose we can draw a network graph, G from a generative


model, so that G comes from a probability distribution Pr (G |θ),
governed by parameter θ
so that if θ is partly known, it can act as a constraint in generating
synthetic graphs (similar to G )
if G is partly known, we can use it to infer a θ that make G likely

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 3 / 28


How do we obtain θ?

Y
Pr (G |θ) = Pr (Aij |θ)
ij

Vertex-level attributes: too chaotic (over-fit)


Global-network attributes: too general (over-generalize)

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 4 / 28


Introducing structure

Differentiates data from noise


Captures relevant patterns
Describing patterns and predicting them

Vertices → Communities

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 5 / 28


Let’s Rephrase the Motivating Question

How do groups of vertices make decisions to connect/not to connect to


other groups?

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 6 / 28


The stochastic block model
Class of models of which Mixed Membership Stochastic Blockmodel is a
variant.

Aaron Clauset (UColorado at Boulder, Computer Science)

has amazing slides...


Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 7 / 28
the stochastic block model

• each vertex i has type zi 2 {1, . . . , k} (k vertex types or groups)


• stochastic block matrix M of group-level connection probabilities
• probability that i, j are connected = Mzi ,zj

community = vertices with same pattern of inter-community connections

inferred M
C
C

D
s
ore

tors
sc

s
ent

ion
dic
gnm

9
reg
t in

ion
the stochastic block model

assortative disassortative ordered core-periphery


edges within groups edges between groups linear group hierarchy dense core, sparse periphery
the stochastic block model

the most general SBM


Y
Pr(A | z, ✓) = f Aij | ✓R(zi ,zj )
i,j

Aij : value of adjacency Binomial = simple graphs


Poisson = multi-graphs
R : partition of adjacencies Normal = weighted graphs
etc.
f : probability function
✓a,⇤ : pattern for a -type adjacencies
✓11 ✓12 ✓13 ✓14
✓21 ✓22 ✓23 ✓24
✓31 ✓32 ✓33 ✓34
✓41 ✓42 ✓43 ✓44
Interactions across blocks

Classes of SBMs are looking at interactions across ”blocks”


They are not directly looking for patches of ”connectedness” among
nodes.
Assume that individual nodes’ behavior can be explained entirely by
group membership.

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 8 / 28


Sociological contributions

If this way of operationalizing the problem (in terms of group membership)


seems familiar, that is because sociologists have made many contributions
to this line of research.

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 9 / 28


Implications for interactions within blocks

Given this framework, what are we implying about connections between


vertices i and j if they belong to the same group k?

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 10 / 28


Outline

1 Overview

2 The MMSB Model


Mixed Membership
Model Estimation

3 Application of Mixed Membership Model


Empirical and Synthetic Data
Drawbacks to the MMSB
Model Flexibility

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 11 / 28


Mixed Membership

Mixed membership stochastic block model (MMSB) (f = Bernoulli)

Similar to SBM, but with an extra layer of parameters to estimate.


Key assumptions remain: Pr (i → j) = Mzi,zj
M = Stochastic Block Matrix
But, zi and zj must be estimated for each dyadic interaction between all
i and j vertices, based on a latent mixed membership vector for each i.

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 12 / 28


Mixed Membership

each vertex i has a mixed membership vector θi ∼ Dirichlet(α)


vertex i takes on a single group membership with probability θi in the
context of a directed dyadic interaction with vertex j

For each pair of vertices (i, j) ∈ [1, N]x[1, N] (adjacency matrix),

Sample group membership of i and j independently

Sample group zi→j ∼ Multinomial(θi , 1)

Sample group zi←j ∼ Multinomial(θi , 1)

Note: This model specification can be adapted to undirected interactions


easily, zi↔j

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 13 / 28


Mixed Membership

Key Innovations:
nodes belong to more than one group
nodes belong to groups with different strengths of membership
nodes take on a specific group membership for the duration of an
interaction
include sparsity parameter, control model’s sensitivity to zeros in
adjacency matrix due to noise.

Final sampling of Aij ∼ Bernoulli (ρ zi→j M zi←j + (1 − ρ) δ0 )

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 14 / 28


Mixed Membership

joint probability distribution:

N
Y N
Y
p(A, θ, zi , zj |α, M) = p(θi |α) p(zi→j |θi )p(zi←j |θj )p(Aij |zi→j , zi←j , M)
i=1 j=1

Where,
A is the observed adjacency matrix
M is the block matrix
θi and θj are the mixed membership vectors for i and j
zi and zj are the group membership indicators for i and j during their
interaction

The only input to this model is the number of groups.

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 15 / 28


Outline

1 Overview

2 The MMSB Model


Mixed Membership
Model Estimation

3 Application of Mixed Membership Model


Empirical and Synthetic Data
Drawbacks to the MMSB
Model Flexibility

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 16 / 28


MMSB Estimation
We won’t focus on this too much...
Only to note that the actual marginal probability (likelihood) of
p(A|α, M) is not tractable to compute (i.e. we cannot integrate
out z and α).
Airoldi et al. carry out an approximate inference and parameter
estimation.

In order to compute the posterior degrees of membership for all i given


hyperparameters (θ and α):

p(θ, A|α, M)
p(θ|A, α, M) =
p(A|α, M)

They use variational methods: ”find a lower bound of the likelihood and
approximate posterior distributions for each objects membership vector.”
(Airoldi et al, 2015 p. 7)
Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 17 / 28
MMSB Estimation

What are the implications of this estimation strategy for the model?

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 18 / 28


Outline

1 Overview

2 The MMSB Model


Mixed Membership
Model Estimation

3 Application of Mixed Membership Model


Empirical and Synthetic Data
Drawbacks to the MMSB
Model Flexibility

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 19 / 28


Sampson monk factions

(b) Airoldi et al.: Variational Methods


(a) MMSB with LDA

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 20 / 28


Synthetic data

How to create ”good” synthetic data?


various levels of difficulty of detection? (cin − cout )
specific block patterns?
always the problem of linking recovered partitions to actual theorized
groups

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 21 / 28


Outline

1 Overview

2 The MMSB Model


Mixed Membership
Model Estimation

3 Application of Mixed Membership Model


Empirical and Synthetic Data
Drawbacks to the MMSB
Model Flexibility

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 22 / 28


Potential Issues

1 Estimation procedure
2 Selecting ”K” number of groups

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 23 / 28


Selecting K

Two Goals: Prediction and Interpretation

1 Prediction: We want to identify the number of group affiliations, K ,


most predictive of observed patterns of interaction, G .

2 Interpretation (Hypothesis-Driven): We want to determine


how predictive a specific set of hypothesized group affiliations, K , is
of observed patterns of interaction, G .

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 24 / 28


Selecting K

Our goal affects how we deal with two key issues


in community detection:

1 How do we determine the right number of groups in analyzing the


interactions in G ?
prediction: Find the most predictive K (i.e. BIC,
or cross-validation).
interpretation: ?

2 How do we know that the way our algorithm (”heuristic”) partitioned


the data is always the same for K groups?
prediction: permutation of components of θi to
interpret [E |θi (k)|A], if we know components of each
functional group.
interpretation: ?

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 25 / 28


Outline

1 Overview

2 The MMSB Model


Mixed Membership
Model Estimation

3 Application of Mixed Membership Model


Empirical and Synthetic Data
Drawbacks to the MMSB
Model Flexibility

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 26 / 28


There are also times when we, as researchers, have prior (if partial)
information that we want to test and inject in our model...
partial information on K
partial information on M (block, or ”mixing” matrix)
partial information on θ (mixed membership vectors
partial information on decision rule for selecting zi→j or zi←j . (i.e.
we might want to break the independence assumption).
or any combination of the above

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 27 / 28


Questions to the room

Are there tools available to allow researchers to take advantage of


this potential flexibility in the MMSB and SBMs in general?
What types of research question can we address using the MMSB
model?
What types of research question can we address if we can take full
advantage of the model’s potential flexibility?

and as always, there are questions about the assumptions made by this
model.
Under what circumstances (for what research questions) are we
willing to make them?
Under what circumstances would these assumptions not hold?

Herrissa Lamothe (Princeton University) Mixed Membership Stochastic Blockmodels 28 / 28

You might also like