
Fast Graph Representation Learning with PyTorch Geometric
GNN reading group
05/09/2019

Thibault Formal
Motivation
Up to now, we have been dealing with models, but not with how to implement / train
them

Graph training: slight differences w.r.t. “standard” training (what is an example,
what is a batch, relations between data points, etc.)

Not necessarily difficult (see the PyTorch implementation of GCN on Kipf’s GitHub:
https://github.com/tkipf/pygcn)

… but not necessarily optimized

Need for a unified, simple and efficient interface to quickly experiment


High level overview
Fast Graph Representation Learning with PyTorch Geometric, Matthias Fey and
Jan E. Lenssen, 2019 (ICLR workshop)

PyTorch Geometric (PyG) is a library for deep learning on irregularly structured input
data such as graphs, point clouds and manifolds, built upon PyTorch

github: https://github.com/rusty1s/pytorch_geometric

doc: https://rusty1s.github.io/pytorch_geometric/build/html/index.html
Main contributions (overview)
● Large number of common benchmark datasets (CORA, CiteSeer etc.)
● Easy to use dataset interface (for custom ones)
● Helpful data transforms
● Mini-batch handling (graphs of different sizes)
● Clean message passing API
● High data throughput
● Bundles many recently proposed GCN-like layers
● …
Many already implemented operators/models (> 25)

● Semi-Supervised Classification with Graph Convolutional Networks, Kipf and
Welling, ICLR 2017 (Julien’s presentation) → GCNConv

● Modeling Relational Data with Graph Convolutional Networks, Schlichtkrull et
al., ESWC 2018 (Yagmur’s presentation) → RGCNConv

● Simplifying Graph Convolutional Networks, Wu et al., CoRR 2019 (Noé’s
presentation) → SGConv

● How Powerful are Graph Neural Networks?, Xu et al., ICLR 2019 (Rohit’s
presentation) → GINConv

● [...]
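
As a quick taste of the API, here is a minimal sketch of instantiating and applying one of these bundled operators (the layer sizes and random inputs below are made up for illustration):

```python
import torch
from torch_geometric.nn import GCNConv

conv = GCNConv(in_channels=16, out_channels=32)

x = torch.randn(100, 16)                      # 100 nodes with 16 features each
edge_index = torch.randint(0, 100, (2, 500))  # 500 random directed edges (toy graph)

out = conv(x, edge_index)                     # out has shape (100, 32)
```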
Installation
Quick note on installation
It took me one day to set up everything on my laptop + servers

Installation is well documented, but a few things are not clear (at least to me)
My recommendation (additional notes w.r.t. doc)
1. set up a new conda environment
2. conda install -c psi4 gcc-5 → ensures a valid gcc version (see:
https://github.com/rusty1s/pytorch_geometric/issues/170) (thanks Rohit)
3. Check if CUDA is installed (on my machine located in /usr/local/cuda-10.1/, on
the servers in /nfs/core/cuda/*)1
4. then install PyTorch (1.1, choose the right CUDA version):
https://pytorch.org/get-started/locally/
5. Then strictly follow the installation guide:
https://rusty1s.github.io/pytorch_geometric/build/html/notes/installation.html
6. This should work!
1: On my laptop I installed it following: https://www.if-not-true-then-false.com/2018/install-nvidia-cuda-toolkit-on-fedora/
A note on design choice
SpMM vs gather + scatter
Generally, you can express message passing as a sparse-dense matrix multiplication

X' = A X W

with A a sparse (normalized) adjacency matrix, X the node features and W a weight matrix

Then you can implement the model with SpMM (sparse matrix multiplication), which is
what is done in Kipf’s GCN

But in PyG, there is no single place where SpMM is used!

Gather/Scatter scheme (with dedicated CUDA kernels)


Scatter operation in a nutshell (add):

index:  0 0 1 2 1 0 3 2
source: 5 2 3 1 1 2 7 2
out:    9 4 3 7

Efficiency: faster than SpMM (for not too dense settings)

Flexibility: e.g. easy integration of edge features
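
A minimal sketch of this scatter-add using the torch_scatter package that PyG builds upon, with the numbers from the example above:

```python
import torch
from torch_scatter import scatter_add

index = torch.tensor([0, 0, 1, 2, 1, 0, 3, 2])   # where each source value goes
source = torch.tensor([5, 2, 3, 1, 1, 2, 7, 2])

# Values sharing the same index are summed together
out = scatter_add(source, index)
print(out)  # tensor([9, 4, 3, 7])
```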


Data
Data handling of graphs
A single graph is described by an instance of torch_geometric.data.Data, which
holds some default attributes:

● data.x: node features, shape (num_nodes, num_features)


● data.edge_index: adjacency matrix in COO format, shape (2, num_edges)
● data.y: labels (any shape)
● etc.

None of these are required, and one can easily add new attributes by extending
the class
Example
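
A minimal sketch, mirroring the introductory example from the PyG documentation (the features and labels are made up):

```python
import torch
from torch_geometric.data import Data

# A 3-node graph with two undirected edges (0-1 and 1-2),
# stored as 4 directed edges in COO format
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.tensor([[-1.0], [0.0], [1.0]])  # one feature per node
y = torch.tensor([0, 1, 0])               # e.g. one label per node

data = Data(x=x, edge_index=edge_index, y=y)
print(data)  # Data(x=[3, 1], edge_index=[2, 4], y=[3])
```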
Dataset and DataLoader
Then similarly to PyTorch, you can define a torch_geometric.data.Dataset (that
may contain several graphs) and a torch_geometric.data.DataLoader (to iterate
over a dataset)

Large number of benchmark datasets + clean interface to build your own + useful
data transforms

How to batch graphs with different numbers of nodes / edges? → Build a large sparse
block-diagonal adjacency matrix and concatenate the feature and target matrices (no
messages are exchanged between these disconnected graphs); an example follows
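
A minimal sketch of this batching, using one of the bundled benchmark datasets (ENZYMES, picked purely as an illustration):

```python
from torch_geometric.data import DataLoader
from torch_geometric.datasets import TUDataset

dataset = TUDataset(root='/tmp/ENZYMES', name='ENZYMES')  # 600 small graphs
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for batch in loader:
    # batch.x / batch.edge_index describe one big block-diagonal graph;
    # batch.batch maps every node back to its original graph in the mini-batch
    print(batch.num_graphs, batch.x.size(), batch.batch.size())
```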
The Message Passing interface
Message Passing networks
Neighborhood aggregation or message passing scheme ([2] Gilmer et al., 2017):

x_i^(k) = γ( x_i^(k-1), AGG_{j ∈ N(i)} φ( x_i^(k-1), x_j^(k-1), e_ij ) )

where γ is the update function and φ the message function:

● AGG: differentiable, permutation-invariant function, e.g. sum, mean or max

● φ, γ: differentiable functions, e.g. MLPs

[2] Gilmer et al., Neural Message Passing for Quantum Chemistry, 2017 (ICML)
The MessagePassing base class
PyG provides the torch_geometric.nn.MessagePassing base, which helps
creating such networks, by automatically taking care of message propagation

You need to extend this class and define:

● message function
● update function
● aggregation scheme (add, mean or max)
Under the hood (high level view)
Toy example

The graph’s connectivity in COO format:

A (COO) = [[0, 0, 0, 1, 1, 2, 3, 3],
           [1, 2, 3, 0, 3, 0, 0, 1]]
Toy example - simple node update

A = [[0, 0, 0, 1, 1, 2, 3, 3],   ← nodes i, which aggregate the messages
     [1, 2, 3, 0, 3, 0, 0, 1]]   ← neighbor nodes j, sources of the messages

Messages are computed by gathering node features with the second (neighbor) index:

messages = [φ(x_1^(k-1)), φ(x_2^(k-1)), φ(x_3^(k-1)), φ(x_0^(k-1)), φ(x_3^(k-1)), φ(x_0^(k-1)), φ(x_0^(k-1)), φ(x_1^(k-1))]

They are then aggregated with a scatter op (e.g. sum) using the first index: node 0
aggregates the messages from nodes 1, 2 and 3; node 1 from 0 and 3; node 2 from 0;
node 3 from 0 and 1.
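
A minimal sketch of that gather + scatter update on the toy graph (φ is taken to be a single linear layer purely for illustration):

```python
import torch
from torch_scatter import scatter_add

edge_index = torch.tensor([[0, 0, 0, 1, 1, 2, 3, 3],
                           [1, 2, 3, 0, 3, 0, 0, 1]])
x = torch.randn(4, 16)             # node features x^(k-1) for the 4 nodes
phi = torch.nn.Linear(16, 16)      # message function φ

i, j = edge_index                  # i: aggregating nodes, j: neighbors
messages = phi(x[j])               # gather: one message per edge
out = scatter_add(messages, i, dim=0, dim_size=x.size(0))  # scatter: sum per node i
```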
Few lines of code
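
For instance, a GCN-like layer only takes a few lines. The sketch below closely follows the example from the PyG documentation (treat it as illustrative rather than the exact library code):

```python
import torch
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree

class MyGCNConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super(MyGCNConv, self).__init__(aggr='add')   # aggregation scheme: sum
        self.lin = torch.nn.Linear(in_channels, out_channels)

    def forward(self, x, edge_index):
        # add self-loops and linearly transform the node features
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))
        x = self.lin(x)
        # compute the GCN normalization D^-1/2 A D^-1/2
        row, col = edge_index
        deg = degree(row, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]
        # start propagating messages
        return self.propagate(edge_index, x=x, norm=norm)

    def message(self, x_j, norm):      # message: normalized neighbor features
        return norm.view(-1, 1) * x_j

    def update(self, aggr_out):        # update: identity
        return aggr_out
```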
Should you use PyG?
Contender: Deep Graph Library (DGL)
Both accepted at the ICLR 2019 RLGM workshop (this week!)

At the time of writing, PyG >> DGL (much faster, up to 15 times faster!)

But…
Fused message passing
Blog post: https://www.dgl.ai/blog/2019/05/04/kernel.html

Standard message passing does not scale to large graphs: messages are
explicitly materialized and stored

What about GCN on GraphSAGE’s dataset (232K nodes and 114M edges)?

They introduce fused message passing, i.e. messages are never explicitly materialized

Performance is similar for small graphs


Thank you!
