
Fast Graph Representation Learning with PyTorch Geometric
GNN reading group
05/09/2019

Thibault Formal
Motivation
Up to now, we have been dealing with models, but not with how to implement / train
them

Graph training: slight differences w.r.t. “standard” training (what is an example,
what is a batch, relations between data points, etc.)

Not necessarily difficult (see the PyTorch implementation of GCN on Kipf’s GitHub:
https://github.com/tkipf/pygcn)

… but not necessarily optimized

Need for a unified, simple and efficient interface to quickly experiment


High level overview
Fast Graph Representation Learning with PyTorch Geometric, Matthias Fey and
Jan E. Lenssen, 2019 (ICLR workshop)

PyTorch Geometric (PyG) is a library for deep learning on irregularly structured input
data such as graphs, point clouds and manifolds, built upon PyTorch

github: https://github.com/rusty1s/pytorch_geometric

doc: https://rusty1s.github.io/pytorch_geometric/build/html/index.html
Main contributions (overview)
● Large number of common benchmark datasets (CORA, CiteSeer etc.)
● Easy to use dataset interface (for custom ones)
● Helpful data transforms
● Mini-batch handling (graphs of different sizes)
● Clean message passing API
● High data throughput
● Bundles many recently proposed GCN-like layers
● …
Many already implemented operators/models (> 25)

● Semi-Supervised Classification with Graph Convolutional Networks, Kipf and
Welling, ICLR 2017 (Julien’s presentation) → GCNConv

● Modeling Relational Data with Graph Convolutional Networks, Schlichtkrull et
al., ESWC 2018 (Yagmur’s presentation) → RGCNConv

● Simplifying Graph Convolutional Networks, Wu et al., CoRR 2019 (Noé’s
presentation) → SGConv

● How Powerful are Graph Neural Networks?, Xu et al., ICLR 2019 (Rohit’s
presentation) → GINConv

● [...]
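
As a quick taste of the API, here is a minimal sketch of instantiating and applying one of these bundled operators (the layer sizes and random inputs below are made up for illustration):

```python
import torch
from torch_geometric.nn import GCNConv

conv = GCNConv(in_channels=16, out_channels=32)

x = torch.randn(100, 16)                      # 100 nodes with 16 features each
edge_index = torch.randint(0, 100, (2, 500))  # 500 random directed edges (toy graph)

out = conv(x, edge_index)                     # out has shape (100, 32)
```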
Installation
Quick note on installation
It took me one day to set up everything on my laptop + servers

Installation is well documented, but a few things are not clear (at least to me)
My recommendation (additional notes w.r.t. doc)
1. set up a new conda environment
2. conda install -c psi4 gcc-5 → ensures a valid gcc version (see:
https://github.com/rusty1s/pytorch_geometric/issues/170) (thanks Rohit)
3. Check if CUDA is installed (on my machine located in /usr/local/cuda-10.1/, on
the servers in /nfs/core/cuda/*)1
4. then install PyTorch (1.1, choose the right CUDA version):
https://pytorch.org/get-started/locally/
5. Then strictly follow the installation guide:
https://rusty1s.github.io/pytorch_geometric/build/html/notes/installation.html
6. This should work!
1: On my laptop I installed it following: https://www.if-not-true-then-false.com/2018/install-nvidia-cuda-toolkit-on-fedora/
A note on design choice
SpMM vs gather + scatter
Generally, you can express message passing as a sparse-dense matrix multiplication

X' = A X W

with A a sparse (normalized) adjacency matrix, X the node features and W a weight matrix

Then you can implement the model with SpMM (sparse matrix multiplication), which is
what is done in Kipf’s GCN

But in PyG, there is no single place where SpMM is used!

Gather/Scatter scheme (with dedicated CUDA kernels)


Scatter operation in a nutshell (add):

index:  0 0 1 2 1 0 3 2
source: 5 2 3 1 1 2 7 2
out:    9 4 3 7

Efficiency: faster than SpMM (for not too dense settings)

Flexibility: e.g. easy integration of edge features
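
A minimal sketch of this scatter-add using the torch_scatter package that PyG builds upon, with the numbers from the example above:

```python
import torch
from torch_scatter import scatter_add

index = torch.tensor([0, 0, 1, 2, 1, 0, 3, 2])   # where each source value goes
source = torch.tensor([5, 2, 3, 1, 1, 2, 7, 2])

# Values sharing the same index are summed together
out = scatter_add(source, index)
print(out)  # tensor([9, 4, 3, 7])
```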


Data
Data handling of graphs
A single graph is described by an instance of torch_geometric.data.Data, which
holds some default attributes:

● data.x: node features, shape (num_nodes, num_features)


● data.edge_index: adjacency matrix in COO format, shape (2, num_edges)
● data.y: labels (any shape)
● etc.

None of these are required, and one can easily add new attributes by extending
the class
Example
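
A minimal sketch, mirroring the introductory example from the PyG documentation (the features and labels are made up):

```python
import torch
from torch_geometric.data import Data

# A 3-node graph with two undirected edges (0-1 and 1-2),
# stored as 4 directed edges in COO format
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.tensor([[-1.0], [0.0], [1.0]])  # one feature per node
y = torch.tensor([0, 1, 0])               # e.g. one label per node

data = Data(x=x, edge_index=edge_index, y=y)
print(data)  # Data(x=[3, 1], edge_index=[2, 4], y=[3])
```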
Dataset and DataLoader
Then similarly to PyTorch, you can define a torch_geometric.data.Dataset (that
may contain several graphs) and a torch_geometric.data.DataLoader (to iterate
over a dataset)

Large number of benchmark datasets + clean interface to build your own + useful
data transforms

How to batch graphs with different numbers of nodes / edges? → Build a large sparse
block-diagonal adjacency matrix and concatenate the feature and target matrices (no
messages are exchanged between these disconnected graphs); an example follows
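
A minimal sketch of this batching, using one of the bundled benchmark datasets (ENZYMES, picked purely as an illustration):

```python
from torch_geometric.data import DataLoader
from torch_geometric.datasets import TUDataset

dataset = TUDataset(root='/tmp/ENZYMES', name='ENZYMES')  # 600 small graphs
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for batch in loader:
    # batch.x / batch.edge_index describe one big block-diagonal graph;
    # batch.batch maps every node back to its original graph in the mini-batch
    print(batch.num_graphs, batch.x.size(), batch.batch.size())
```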
The Message Passing interface
Message Passing networks
Neighborhood aggregation or message passing scheme ([2] Gilmer et al., 2017):

x_i^(k) = γ( x_i^(k-1), AGG_{j ∈ N(i)} φ( x_i^(k-1), x_j^(k-1), e_ij ) )

where γ is the update function and φ the message function:

● AGG: differentiable, permutation-invariant function, e.g. sum, mean or max

● φ, γ: differentiable functions, e.g. MLPs

[2] Gilmer et al., Neural Message Passing for Quantum Chemistry, 2017 (ICML)
The MessagePassing base class
PyG provides the torch_geometric.nn.MessagePassing base, which helps
creating such networks, by automatically taking care of message propagation

You need to extend this class and define:

● message function
● update function
● aggregation scheme (add, mean or max)
Under the hood (high level view)
Toy example

The graph’s connectivity in COO format:

A (COO) = [[0, 0, 0, 1, 1, 2, 3, 3],
           [1, 2, 3, 0, 3, 0, 0, 1]]
Toy example - simple node update

A = [[0, 0, 0, 1, 1, 2, 3, 3],   ← nodes i, which aggregate the messages
     [1, 2, 3, 0, 3, 0, 0, 1]]   ← neighbor nodes j, sources of the messages

Messages are computed by gathering node features with the second (neighbor) index:

messages = [φ(x_1^(k-1)), φ(x_2^(k-1)), φ(x_3^(k-1)), φ(x_0^(k-1)), φ(x_3^(k-1)), φ(x_0^(k-1)), φ(x_0^(k-1)), φ(x_1^(k-1))]

They are then aggregated with a scatter op (e.g. sum) using the first index: node 0
aggregates the messages from nodes 1, 2 and 3; node 1 from 0 and 3; node 2 from 0;
node 3 from 0 and 1.
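
A minimal sketch of that gather + scatter update on the toy graph (φ is taken to be a single linear layer purely for illustration):

```python
import torch
from torch_scatter import scatter_add

edge_index = torch.tensor([[0, 0, 0, 1, 1, 2, 3, 3],
                           [1, 2, 3, 0, 3, 0, 0, 1]])
x = torch.randn(4, 16)             # node features x^(k-1) for the 4 nodes
phi = torch.nn.Linear(16, 16)      # message function φ

i, j = edge_index                  # i: aggregating nodes, j: neighbors
messages = phi(x[j])               # gather: one message per edge
out = scatter_add(messages, i, dim=0, dim_size=x.size(0))  # scatter: sum per node i
```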
Few lines of code
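
For instance, a GCN-like layer only takes a few lines. The sketch below closely follows the example from the PyG documentation (treat it as illustrative rather than the exact library code):

```python
import torch
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree

class MyGCNConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super(MyGCNConv, self).__init__(aggr='add')   # aggregation scheme: sum
        self.lin = torch.nn.Linear(in_channels, out_channels)

    def forward(self, x, edge_index):
        # add self-loops and linearly transform the node features
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))
        x = self.lin(x)
        # compute the GCN normalization D^-1/2 A D^-1/2
        row, col = edge_index
        deg = degree(row, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]
        # start propagating messages
        return self.propagate(edge_index, x=x, norm=norm)

    def message(self, x_j, norm):      # message: normalized neighbor features
        return norm.view(-1, 1) * x_j

    def update(self, aggr_out):        # update: identity
        return aggr_out
```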
Should you use PyG?
Contender: Deep Graph Library (DGL)
Both accepted at the ICLR 2019 RLGM workshop (this week!)

At the time of writing, PyG >> DGL (much faster, up to 15 times faster!)

But…
Fused message passing
Blog post: https://www.dgl.ai/blog/2019/05/04/kernel.html

Standard message passing does not scale to large graphs: messages are
explicitly materialized and stored

What about GCN on GraphSAGE’s dataset (232K nodes and 114M edges)?

They introduce fused message passing, i.e. messages are never explicitly materialized

Performance is similar for small graphs


Thank you!
