
Modified from the course material of:

Nara Institute of Science and Technology


Data Science Special Lecture

An Introduction to Graph Neural Networks:


basics and applications

Katsuhiko ISHIGURO, Ph.D. (Preferred Networks, Inc.)


Oct. 23, 2020

1
Take home message
• Graph Neural Networks (GNNs): Neural Networks (NNs) that compute node representations of graph-structured data

• Practical applications in industry, intense competition in academia

• Model: the fundamental building block is an approximated graph convolution

• Applications: applicable to many tasks across different domains

2
Table of Contents (1/2)

• Graph Neural Networks (GNNs): what are they?
– Graph-structured data
– Overview of GNNs: functionality and applications
• The basics and the de-facto standard: GCN
– Graph convolutions and the GCN model
– Exemplar task: semi-supervised node classification with GCN

3
Table of Contents (2/2)

• Application in Chemo-Informatics: Protein Interface Prediction
– A straightforward application of GNNs with graphs as input/output
• Application in Computer Vision: Scene Graph generation
– A task-tailored GNN model
• Theoretical issues of GNNs
– Deep GNNs do not work well
– The representation power of GNNs is upper-bounded
• Conclusion and Materials
• Conclusion and Materials
4
Acknowledgements

Many thanks to everyone who gave helpful comments and provided materials!

Daisuke Okanohara, Shin-ichi Maeda, Kentaro Minami,


Yuta Tsuboi, Sousuke Kobayashi, Kenta Oono, Hiroshi
Maruyama (Preferred Networks)

5
Graph: for relational data

• Graph G = (V, E)
– Set of nodes (vertices): V = {v_1, …, v_N}
– Set of edges: E ⊆ V × V
– directional / non-directional

• Especially interested in Attributed graphs
– Nodes and/or edges have some features (labels, numbers, vectors)
6
Matrix representation of Attributed graph

• Suitable for formulations/implementations of GNNs

Feature Matrix X: N nodes, D-dim. features

Adjacency Matrix A: edge existence between N × N node pairs
(symmetric A for non-directional edges)

0 1 1 0 0
1 0 0 1 1
1 0 0 0 1
0 1 0 0 0
0 1 1 0 0

7
GNN Functionality: compute latent representations H

An L-layered GNN takes the feature matrix X and the adjacency matrix A (slide 7) as input and computes:
– a latent representation matrix H of the nodes → node prediction tasks
– a latent representation vector h_G of the whole graph → graph prediction tasks

8
Interface for intelligent processes on graphs

• H is informative and easy-to-handle “data” for machine-learning (ML) models/tools
– informative features
– an ML-friendly structure (a set of vectors)

The L-layered GNN feeds downstream ML components: classification, regression, interpretation, generative models
9
Application Domains of GNNs: concrete applications

• Social network [http://konect.uni-koblenz.de/networks/ucidata-zachary]
• Knowledge graph [DeCao19QA2]
• Chemical molecular graph [Jin18JTVAE, Laugier18CGCNN]
• Scene graph [Qi19AttentiveRN]
• Pointcloud [Landrieu_Boussaha19PCS]
• Physical simulator [Sanchez-Gonzalez20GNS]

10
GNN in Machine Learning Research
~2016: “graphical models”, “graph kernels”, and pure graph-theory works
2017~: GNN works drive up the counts

[Bar chart: numbers of accepted papers with “Graph” in the title at ICML, ICLR, and NeurIPS, 2014–2020; counted by the lecturer]
11
The basic and the de-
facto standard: GCN
Convolution

• Modify a signal f with a filter g, accumulating over the “shift” t along the coordinate (axis) x
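Written out, this is the standard 1D convolution (a reference formula, not copied from the slide):

(f ∗ g)(x) = ∫ f(t) g(x − t) dt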

1D signal example: f, g, and f ∗ g
f: sound signal; g: impulse response → hall-resonated sound (reverb)
[Figure: abcpedia.acoustics.jp]
13
E.g. ConvNet in computer vision [Krizhevsky12ConvNet]
2012 ILSVRC competition

Former (non-DNN) top runners: > 1% differences among them

ConvNet (AlexNet) [Krizhevsky12ConvNet]: a 10% margin over the runner-up
14 [http://image-net.org/challenges/LSVRC/2012/ilsvrc2012.pdf]
E.g. ConvNet in computer vision [Krizhevsky12ConvNet]

• L iterated convolution operations over image pixels


[Figure from [Kipf18talk]: the image signal (feature) at pixel (x, y) of layer l is computed from layer l−1 by shifting a filter over the signal]
15
How can we define Conv. in graphs?

• No “axes” in graphs, unlike sound signals (time) and image signals (x, y)
• What is the “coordinate”? What is the “shift”?

→ Spectral approach on graphs

16 [Irasutoya]
Fourier Transform and Convolution
Convolution in the signal (spatial) domain = multiplication in the spectral (frequency) domain

1D case: the k-th frequency’s Fourier base function (see the formulas below)
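For reference, the standard 1D discrete Fourier transform and convolution theorem that this slide alludes to (my notation; f is a signal with N samples):

f̂(k) = Σ_{n=0}^{N−1} f(n) e^{−2πi k n / N},    (f ∗ g)^(k) = f̂(k) ĝ(k)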

For graphs, what is the signal? How is the FT defined?


17
Def: Graph signals

• Simply concatenate the nodes’ numbers

• 1D node attribute case: an N-dimensional array x

• D-dim attribute case → the feature matrix X (N × D)

[Ma20tutorial]
18
Def: Graph Laplacian and Normalized Laplacian

Graph Laplacian Matrix L = D − A:

 2 -1 -1  0  0
-1  3  0 -1 -1
-1  0  2  0 -1
 0 -1  0  1  0
 0 -1 -1  0  2

Adjacency Matrix A:      Degree Matrix D:
0 1 1 0 0                2 0 0 0 0
1 0 0 1 1                0 3 0 0 0
1 0 0 0 1                0 0 2 0 0
0 1 0 0 0                0 0 0 1 0
0 1 1 0 0                0 0 0 0 2

Normalized Graph Laplacian: L_norm = D^{-1/2} L D^{-1/2} = I − D^{-1/2} A D^{-1/2}
(1 on the diagonal, −1/√(d_i d_j) for each edge (i, j), 0 otherwise)
19
Graph Fourier Transform = multiply by eigenvectors of the (normalized) graph Laplacian

Eigendecomposition L = U Λ U^T:
– the i-th eigenvector u_i = the i-th graph Fourier base
– the i-th eigenvalue λ_i = the frequency of the i-th base

Graph Fourier Transform (GFT): x̂ = U^T x
– the inner product between the graph signal x and the Fourier bases U
– x̂ is the graph signal in the spectral (frequency) space
20
Graph Fourier Transform: just multiply by U^T or U

GFT: spatial domain x → spectral domain x̂ = U^T x
IGFT: spectral domain x̂ → spatial domain x = U x̂

[Ma20tutorial, Shuman13Graph]

21
Graph Convolution via GFT

• Convolution is a multiplication in Fourier space, and the same holds under the GFT

1. GFT of the graph signal: x̂ = U^T x
2. Convolution = simple multiplication in the spectral space: ĝ ⊙ x̂
3. IGFT to get the conv-filtered graph signal: g ∗ x = U (ĝ ⊙ U^T x)
22
Naïve Graph Convolution

g_w ∗ x = U diag(w) U^T x — two kinds of parameters:
– Fixed: U (eigenvectors of the Laplacian = the Fourier bases over frequencies)
– Trainable: w (amplifies each spectral coefficient)

→ Simple linear algebra gives a generic graph convolution (a sketch follows below)
→ But the eigendecomposition of the (normalized) graph Laplacian costs O(N^3) for each graph
23
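A minimal NumPy sketch of this naive spectral convolution on the 5-node example graph from slide 7 (a sketch; variable names are mine, not the lecture's):

```python
import numpy as np

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 0, 1, 1],
              [1, 0, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [0, 1, 1, 0, 0]], dtype=float)   # adjacency matrix from slide 7
D = np.diag(A.sum(axis=1))                     # degree matrix
L = D - A                                      # (unnormalized) graph Laplacian

# Eigendecomposition: columns of U are the graph Fourier bases.
lam, U = np.linalg.eigh(L)                     # O(N^3) -- the bottleneck noted above

x = np.random.randn(5)                         # 1-D graph signal (one number per node)
w = np.random.randn(5)                         # trainable spectral coefficients (one per frequency)

x_hat = U.T @ x                                # GFT: project the signal onto the Fourier bases
y_hat = w * x_hat                              # filtering = element-wise multiplication in spectral space
y = U @ y_hat                                  # inverse GFT back to the node (spatial) domain
```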
Approximation: ChebNet [Defferrard16ChebNet]

• Model the N trainable parameters with an order-K (K ≪ N) Chebyshev polynomial expansion of the Laplacian (see the formula below)
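The Chebyshev-polynomial approximation of the spectral filter, in the standard form of [Defferrard16ChebNet] (the w_k are the K+1 trainable coefficients):

g_w ∗ x ≈ Σ_{k=0}^{K} w_k T_k(L̃) x,   L̃ = (2 / λ_max) L − I_N
T_0(L̃) = I_N,  T_1(L̃) = L̃,  T_k(L̃) = 2 L̃ T_{k−1}(L̃) − T_{k−2}(L̃)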
24
No U’s = no eigendecomposition

→ The graph Laplacian is sparse in nature
→ Evaluating a polynomial in L is lightweight, far cheaper than O(N^3)
25
GCN [Kipf_Welling17GCN]: a much simpler convolution
Set K = 1 and replace λ_max with 2 (its theoretical maximum value)

A very simple graph convolution:
• No eigendecomposition
• No matrix polynomials
• Assume a single tunable parameter w = w_0 = −w_1 (see below)
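The resulting single-parameter convolution of [Kipf_Welling17GCN]:

g_w ∗ x ≈ w (I_N + D^{-1/2} A D^{-1/2}) x

In practice the paper further re-normalizes I_N + D^{-1/2} A D^{-1/2} as D̃^{-1/2}(A + I_N)D̃^{-1/2}; that is the fixed adjacency parameter used in the layer-wise rule on slide 28.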

26
Spatial interpretation of GCN’s convolution

• A weighted sum over neighboring nodes (the “shift”) plus a self-link
• The number of neighboring nodes varies, unlike in image ConvNets

[Figure: self-link and neighboring nodes]
27
GCN formulation: multi-dim simple conv + activation

Update rule of the GCN at the l-th layer (a code sketch follows):

H^{(l+1)} = σ( Â H^{(l)} W^{(l)} )

– σ: non-linear activation (e.g. sigmoid, ReLU)
– input H^{(l)}: node vectors at layer l; output H^{(l+1)}: updated node vectors
– W^{(l)}: trainable weight parameter for node-vector updates
– Â = D̃^{-1/2}(A + I_N)D̃^{-1/2}: fixed adjacency parameter for node conv (mixing)
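A minimal NumPy sketch of this update rule on the slide-7 example graph (a sketch, not the lecture's code; names are mine):

```python
import numpy as np

def normalized_adjacency(A):
    """A_hat = D~^{-1/2} (A + I) D~^{-1/2}: the fixed mixing matrix of GCN."""
    A_tilde = A + np.eye(A.shape[0])            # add self-links
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def gcn_layer(H, A_hat, W):
    """One GCN layer: H^{l+1} = ReLU(A_hat H^{l} W^{l})."""
    return np.maximum(A_hat @ H @ W, 0.0)

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 0, 1, 1],
              [1, 0, 0, 0, 1],
              [0, 1, 0, 0, 0],
              [0, 1, 1, 0, 0]], dtype=float)
X = np.random.randn(5, 4)                       # node feature matrix (N=5, D=4)
W0 = np.random.randn(4, 8)                      # trainable weights of layer 0
H1 = gcn_layer(X, normalized_adjacency(A), W0)  # updated node vectors (5 x 8)
```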

28
Once trained, run anywhere (on different topologies)
• The sole trainable parameter W does not depend on the adjacency

A trained GCN is applicable to different graphs.

[Illustration: Irasutoya, https://www.irasutoya.com] 29
The simplest GCN… and it works quite well

It performs better than the original ChebNet [Kipf_Welling17GCN]

30
Exemplar task: semi-supervised node classification with GCN

Number of nodes: N (many), split into a set of labeled nodes (supervision) and a set of unlabeled nodes

Goal: predict the labels y_j of the unlabeled nodes using a trained model (GCN and classifier)

Train: tune the parameters of the model to recover the known labels y_i
31
Forward path
Apply the L-layered GCN to update the latent node representations H

– the initial node vectors H^{(0)} = the node attributes X

A classifier computes the label probability distribution based on the final H
– label set C = { , }; θ: trainable parameters of the classifier
32
Objective Function and Training
“Labeled” objective function: cross entropy ≈ accuracy of label prediction on the labeled nodes

Training: find the best parameters W (GCN) and θ (classifier) that minimize the cross-entropy loss L (written below)
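A standard way to write this cross-entropy loss over the labeled set (my notation, assuming one-hot labels y_i and predicted class probabilities p_i from the classifier):

L(W, θ) = − Σ_{i ∈ labeled} Σ_{c ∈ C} y_{ic} ln p_{ic}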

33
Prediction using the Trained Model
To predict y_j (the label of node j) in the “unlabeled” set, perform the forward path from the observed x_j
34
Application in Chemo-
Informatics: Protein
Interface Prediction
35
Interacting proteins

• Where is the binding site between two proteins?
– Binding site: where a ligand and a receptor interact

[Figure: a ligand (Drug A, Drug A’) and a receptor (human organ, cell, …) connect at binding sites, the interface for connection]
36
Predict binding-site nodes by GNN [Fout17Interface]

• Predict the binding-site regions of ligand–receptor proteins
– Possible binding = possible drug discovery and improvement

• A protein is a graph of inter-connected amino acids

• Use a GNN for the binding prediction
37
[Sanyal20PGCN]

Attributed Protein Graphs

Node (residue): a sub-structure consisting of amino acids

Ligand protein graph / Receptor protein graph

Binding sites: specific nodes that interact between the ligand and the receptor

38
Training/Test of Interaction Prediction [Fout17Interface]

• Training: optimize the parameters to predict the interaction label between a node pair ( , ), for all training samples (minimize the loss)

• Test: predict the interactions of node pairs in unseen ligand–receptor pairs
– “Train once, run on any graph (with different topology)”
39
Architecture

Classify all pairs of node latent vectors in one shot

[Fout17Interface]
40
GCN formulation choices [Fout17Interface]

– Node features only
– Node features plus edge features
– With an ordering of the nearest nodes: neighbor nodes j are ordered by their distances from node i

(A rough sketch of this family of updates follows.)
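Roughly, the kind of neighborhood update these choices correspond to can be sketched as follows (my paraphrase in my own notation; see [Fout17Interface] for the exact forms):

h_i' = σ( W^C x_i + (1/|N_i|) Σ_{j ∈ N_i} W^N x_j + (1/|N_i|) Σ_{j ∈ N_i} W^E e_ij + b )

where the edge-feature term with e_ij appears only in the "node features plus edge features" variant, and the ordered variant replaces the averages with per-position weights for the distance-ordered neighbors.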


41
Results

[Fout17Interface]
42
[Fout17Interface]
43
Computer Vision
Application Example:
Scene Graph generation
Scene Graph: a summary graph of image contents

• A useful representation for image understanding
• We want to generate a scene graph from an image input

In a scene graph of an image:
– node: an object region in the image
– edge: a relation between object regions
[Qi19AttentiveRN]
45
Graph R-CNN for scene-graph generation [Yang18GRCNN]
• Consists of three inner components
– (a) → (b): an off-the-shelf object detector [Ren15RCNN] (omitted here)
– (b) → (c): RePN to prune unnecessary branches (omitted here)
– (c) → (d): an attentional GCN to predict the labels of nodes and edges
46 [Yang18GRCNN]
Expand the graph with relation nodes: one GNN can infer all objects and relations

Original input graph (sparsified by RePN): node = detected object, edge = relation
→ Graph expansion over the detected objects [Yang18GRCNN]

Legend: object node, relation node; Obj–Obj edge, Obj–Rel edge
47
Attentional GCN [Yang18GRCNN]
• A GCN update with attention-based [Bahdanau15Attention] variable connections

– Plain GCN: the adjacency is fixed
– Attentional GCN: attention connections determine how object nodes’ and relation nodes’ latent vectors are mixed
– The parameter f can be switched based on the node types
48
Inferring labels of objects/relations
From the expanded graph to the ground-truth scene graph:
– object nodes’ latent vectors → object labels
– relation nodes’ latent vectors → relation labels
49 [Yang18GRCNN]
[Example image: predicted object and relation labels]
50
[Yang18GRCNN]
Other Applications of GNNs

• Segmentation of pointclouds with 10^5 samples [Hu20RandLaNet]
– Decode laser rangefinder data into a 3D obstacle map
• Particle-based physical simulator [Sanchez-Gonzalez20GNS]
– Simulate fluid dynamics via a GNN
• Molecular graph generation [Jin18JTVAE, Madhawa19GNVP]
– Generate novel compound graphs in the computer

51
Theoretical issues

52
Theoretical Topics of GNNs

• “Deep” GNNs do not work well
– “Oversmoothing”
– Current solution: a normalizer + a residual connection

• The theoretical limit of the representation power of GNNs
– The graph isomorphism test
– Invariance/equivariance

53
``Deep” GNNs do not work well
Quite different from the successful deep Conv models in Computer vision

[Figure from [Kipf_Welling17GCN]: performance vs. number of layers (higher is better)]
54
Oversmoothing Problem [Li18Oversmoothing]
• Latent node vectors get closer to each other as the GCN layers go deeper
• It becomes difficult to distinguish nodes in deeper GCNs

Latent node vectors of the “Karate club” social network [Li18Oversmoothing]

55
Oversmoothing is theoretically proven
• A deeper GCN converges to a solution where connected nodes have similar latent vectors [Li18Oversmoothing, NT_Maehara19revisit]

• Such convergence in GCNs proceeds very quickly (exponentially in the depth), regardless of the initial node vectors [Oono_Suzuki20Exponential]
– A similar conclusion also holds for generic GNNs
56
A good workaround [Zhou20Deep, Chen20SimpleDeep, Li20DeeperGCN]

• Combine a proper normalizer and a residual term
– Normalizing the latent node vectors keeps them distant from each other [Zhao_Akoglu20PairNorm]
– Residual terms keep the norm of the loss gradients at a moderate scale [Kipf_Welling17GCN]

Residual term: add the current layer “as it is” (sketched below)
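One way to write such a residual update, reusing the GCN layer notation from slide 28 (a generic sketch, not a specific paper's exact formulation):

H^{(l+1)} = σ( Â H^{(l)} W^{(l)} ) + H^{(l)}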

This is not a workaround that yields “deeper layers, stronger GNNs” (as in image recognition)
57
The theoretical limit of the representation power of GNNs
• Surprisingly, the limitation of GNNs is already known for the problem of the “graph isomorphism” test

• The theory does NOT directly give limitations of GNNs’ power on other tasks (e.g. node classification), but it is loosely related [Chen19RGNN]

58
Graph isomorphism problem (for the theory of GNNs)

• Decide whether two given graphs admit an edge-preserving bijective map between their node sets
– i.e. do they have the same topology in terms of edges?

The same graph?

[https://ja.wikipedia.org/wiki/グラフ同型]

59
Weisfeiler-Lehman (WL) test algorithm [Weisfeiler_Lehman68WL]

• A popular heuristic to test isomorphism

• Idea: concatenate the neighbour nodes’ labels to check the edge topology (a sketch follows)
– Used in graph kernels [Shervashidze11WLKernel, Togninalli18WWL] and GNNs [Jin17WLorg, Morris19kGNN]
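A minimal Python sketch of the 1-dimensional WL color-refinement idea described above (my own illustrative code, not from the lecture):

```python
def wl_refinement(labels, adj, n_iter=3):
    """1-dimensional WL color refinement.

    labels: dict node -> initial label; adj: dict node -> list of neighbours.
    """
    for _ in range(n_iter):
        # signature = own label + sorted multiset of neighbour labels
        signatures = {v: (labels[v], tuple(sorted(labels[u] for u in adj[v])))
                      for v in adj}
        # compress each distinct signature into a fresh compact label
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values()), key=repr))}
        labels = {v: palette[signatures[v]] for v in adj}
    return labels

# Example: a triangle; all nodes keep identical labels under refinement.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(wl_refinement({v: 0 for v in adj}, adj))
# Two graphs whose refined label histograms differ are certainly NOT isomorphic.
```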

60
Weisfeiler-Lehman (WL) test algorithm [Weisfeiler_Lehman68WL]

[Shervashidze11WLKernel]
61
Upper limit of the GNN’s representation power

• In terms of the graph isomorphism problem, a generic GNN ≦ the WL test algorithm [Xu19GIN, Morris19WL]
– There can be cases where the WL test decides isomorphism but a GNN cannot

62
Graph Isomorphism Network (GIN) [Xu19GIN]

• A specific GNN architecture proposed to attain the same graph-isomorphism detection power as the WL test

– A layer update of typical GNNs: one non-linear activation function
– A layer update of the proposed GIN: each layer update must be as powerful as an MLP (formula below)
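The GIN layer update from [Xu19GIN], for reference:

h_v^{(l+1)} = MLP^{(l)}( (1 + ε^{(l)}) h_v^{(l)} + Σ_{u ∈ N(v)} h_u^{(l)} )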


63
Higher-order WL/GNN

• k-dimensional WL (k-WL) test
– labels k-tuples of nodes
– more powerful than the original WL
– k-GNN [Morris19kGNN] is as good as the k-WL test

[Maron19NeurIPS]

64
More powerful GNNs [Sato20Survey]

• k-order graph networks
– consist of all linear functions that are invariant or equivariant to node permutations [Maron19ICLR]
– as powerful as k-GNN, and memory-efficient [Maron19NeurIPS]
• CPNGNN introduces a local node ordering and is strictly more powerful than GIN [Sato19CPNGNN]

65
Conclusion and
Materials
66
Take home message (Revisited)
• Graph Neural Networks (GNNs): Neural Networks (NNs) that compute node representations of graph-structured data

• Practical applications in industry, intense competition in academia

• Model: the fundamental building block is an approximated graph convolution

• Applications: applicable to many tasks across different domains

67
68
Surveys and documents for studying GNN

• Survey papers on GNNs
– [Wu19survey], [Zhou18survey]
– For experts: [Battaglia18survey]

• Tutorials, slides, blogs
– English: [Kipf18talk], [Ma20tutorial]
– Japanese: [Honda19GNN]
69
The most famous dataset resources

• Prof. Leskovec (Stanford)
– Stanford Large Network Dataset Collection
http://snap.stanford.edu/data/index.html

• UC Santa Cruz LINQS group
– https://linqs.soe.ucsc.edu/data
– train/valid/test split: https://github.com/kimiyoung/planetoid
70
Finally: GNNs for your use

• Say goodbye to the “D = {vector x_i, label y_i}” framework with GNNs
– Applicable to generic structured data = graphs
– The standard GCN is strong enough

• Many diverse application fields
– chemistry, pharmacy, materials, computer vision, social networks, …
71
Datasets for specific domains

• MoleculeNet [Wu18MoleculeNet]
– Molecular graph datasets from several chemical fields; TensorFlow implementations attached
• Scene graphs
– Visual Genome [Krishna17VG]: 108K images
• Pointcloud
– S3DIS [Armeni16S3DIS]: indoor office scenes
72
Programming Libraries

• Recommended: PyTorch Geometric
https://github.com/rusty1s/pytorch_geometric
• Deep Graph Library https://www.dgl.ai/

• Chainer Chemistry https://github.com/pfnet-research/chainer-chemistry
– Tailored for molecular graphs, but also applicable to other domains

A minimal usage sketch with PyTorch Geometric follows.
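A minimal sketch of a two-layer GCN in PyTorch Geometric for the semi-supervised node classification task from earlier slides (based on the library's public GCNConv API; details may vary across versions):

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class TwoLayerGCN(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)       # layer 1: features -> hidden
        self.conv2 = GCNConv(hidden_dim, num_classes)  # layer 2: hidden -> class scores

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)               # per-node class scores

# Typical usage: cross-entropy on the labeled nodes only (train_mask).
# model = TwoLayerGCN(data.num_features, 16, num_classes)
# loss = F.cross_entropy(model(data.x, data.edge_index)[data.train_mask],
#                        data.y[data.train_mask])
```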
73
References A-G
[Anderson19Cormorant] Anderson+, “Cormorant: Covariant Molecular Neural Networks”, NeurIPS, 2019.
[Armeni16S3DIS] Armeni+, “3D Semantics Parsing of Large-scale Indoor Spaces”, CVPR, 2016.
[Bahdanau15Attention] Bahdanau+, “Neural Machine Translation by Jointly Learning to Align and Translate”, ICLR,
2015.
[Battaglia18survey] Battaglia+, “Relational Inductive Biases, Deep Learning, and Graph Networks”, arXiv:
1806.01261v3 [cs.LG], 2018.
[Bruna14Spectral] Bruna+, “Spectral Networks and Locally Connected Networks on Graphs”, ICLR, 2014.
[Chen19RGNN] Chen+, “On the Equivalence between Graph Isomorphism Testing and Function Approximation with
GNNs”, NeurIPS 2019.
[Chen20SimpleDeep] Chen+, “Simple and Deep Graph Convolutional Networks”, ICML 2020.
[DeCao19QA2] De Cao+, “Question Answering by Reasoning Across Documents with Graph Convolutional Networks“,
ICML Workshop, 2019.
[Defferrard16ChebNet] Defferrard+, “Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering”,
NIPS, 2016.
[Fout17Interface] Fout+, “Protein Interface Prediction using Graph Convolutional Networks”, NIPS, 2017.
[Gilmer17MPNN] Gilmer+, “Neural Message Passing for Quantum Chemistry”, ICML, 2017.
References H-K
[Hamilton17GraphSAGE] Hamilton+, “Inductive Representation Learning on Large Graphs”, NIPS 2017.
[Honda19GNN] Honda, “GNNまとめ(1-3)”, 2019. https://qiita.com/shionhonda/items/d27b8f13f7e9232a4ae5
https://qiita.com/shionhonda/items/0d747b00fe6ddaff26e2
https://qiita.com/shionhonda/items/e11a9cf4699878723844
[Hu20RandLaNet] Hu+, “RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds”, CVPR 2020.
[Jin17WLorg] Jin+, “Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network”, NIPS 2017.
[Jin18JTVAE] Jin+, “Junction Tree Variational Autoencoder for Molecular Graph Generation”, ICML 2018.
[Kipf18talk] Kipf, “Structured Deep Models: Deep learning on graphs and beyond”, 2018.
http://tkipf.github.io/misc/SlidesCambridge.pdf
[Kipf_Welling17GCN] Kipf and Welling, “Semi-supervised Classification with Graph Convolutional Networks”, ICLR
2017.
[Klicpera20DimeNet] Klicpera+, “Directional Message Passing for Molecular Graphs”, ICLR 2020.
[Krizhevsky12ConvNet] Krizhevsky+, “ImageNet Classification with Deep Convolutional Neural Networks”, NIPS 2012.
[Krishna17VG] Krishna+, “Visual Genome: Connecting language and vision using crowdsourced dense image
annotations”, IJCV, 2017. 75
References L
[Landrieu_Boussaha19PCS] Landrieu and Boussaha, “Point Cloud Oversegmentation with Graph-Structured Deep
Metric Learning”, CVPR, 2019.
[Laugier18CGCNN] Laugier+, “Predicting thermoelectric properties from crystal graphs and material descriptors - first
application for functional materials”, NeurIPS Workshop, 2018.
[Li16GGNN] Li+, “Gated Graph Sequence Neural Networks”, ICLR, 2016.
[Li18Oversmoothing] Li+, “Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning”, arXiv:
1801.07606 [cs.LG], 2018.
[Li19DeepGCNs] Li+, “DeepGCNs: can GCNs go as deep as CNNs?”, ICCV 2019.
[Li20DeeperGCN] Li+, “DeeperGCN: All You Need to Train Deeper GCNs”, arXiv: 2006.07739, 2020.
[Lu16VRD] Lu+,”Visual Relationship Detection with Language Priors”, ECCV, 2016.
[Liu18CGVAE] Liu+, “Constrained Graph Variational Autoencoders for Molecule Design”, NeurIPS 2018.

76
References M-P
[Ma20tutorial] Ma+, “Graph Neural Networks: Models and Applications”, AAAI, 2020.
http://cse.msu.edu/~mayao4/tutorials/aaai2020/
[Madhawa19GNVP] Madhawa+, “GraphNVP: An Invertible Flow Model for Generating Molecular Graphs”, arXiv:
1905.11600, 2019.
[Maron19NeurIPS] Maron+, “Provably Powerful Graph Networks”, NeurIPS, 2019.
[Maron19ICLR] Maron+, “Invariant and Equivariant Graph Networks”, ICLR 2019.
[Morris19kGNN] Morris+, “Weisfeiler and Lehman Go Neural: Higher-order Graph Neural Networks”, AAAI, 2019.
[Nguyen_Maehara20Homomorph] Nguyen and Maehara, “Graph Homomorphism Convolution”, ICML, 2020.
[NT_Maehara19revisit] NT and Maehara, “Revisiting Graph Neural Networks: All We Have is Low-pass Filters”, arXiv:
1905.09550, 2019.
[Oono_Suzuki20Exponential] Oono and Suzuki, “Graph Neural Networks Exponentially Lose Expressive Power for
Node Classification”, ICLR 2020.
[Pei20GeomGCN] Pei+, “Geom-GCN: Geometric Graph Convolutional Networks”, ICLR, 2020.

77
References Q-V
[Qi19AttentiveRN] Qi+, “Attentive Relational Networks for Mapping Images to Scene Graphs”, CVPR, 2019.
[Sanchez-Gonzalez20GNS] Sanchez-Gonzalez+, “Learning to Simulate Complex Physics with Graph Networks”,
arXiv: 2002.09405. 2020.
[Sanyal20PGCN] Sanyal+, “Protein GCN: Protein model quality assessment using graph convolutional networks”,
bioRxiv, 2020.
[Sato19CPNGNN] Sato+, “Approximation Ratios of Graph Neural Networks for Combinatorial Problems”, NeurIPS,
2019.
[Sato20Survey] Sato, “A Survey on the Expressive Power of Graph Neural Networks”, arXiv: 2003.04078.
[Scarselli08GNN] Scarselli+, “The Graph Neural Network Model”, IEEE Trans. Neural Networks, 20(1), pp. 61-80,
2008
[Schlichtkrull18RGCN] Schlichtkrull+, “Modeling Relational Data with Graph Convolutional Networks”, ESWC 2018.
[Shervashidze11WLKernel] Shervashidze+, “Weisfeiler-Lehman Graph Kernels”, JMLR, 12, pp.2539-2661, 2011.
[Shuman13Graph] Shuman+, “The emerging field of signal processing on graphs: extending high-dimensional data
analysis to networks and other irregular domains”, IEEE Signal Processing Magazine, 30(3), pp.83-98, 2013
[Togninalli18WWL] Togninalli+, “Wasserstein Weisfeiler-Lehman Graph Kernels”, NeurIPS 2018.
[Veličković18GAT] Veličković+, “Graph Attention Networks”, ICLR 2018.
References W-Z
[Wang18NLNN] Wang+, “Non-local Neural Networks”, CVPR 2018.
[Weisfeiler_Lehman68WL] Weisfeiler and Lehman, “A Reduction of a Graph to a Canonical Form and an Algebra
Arising during this Reduction”, Nauchno-Technicheskaya Informatsia, Ser.2(9), pp.12-16, 1968.
[Wu18MoleculeNet] Wu+, “MoleculeNet: a benchmark for molecular machine learning”, Chemical Science, 9(513),
2018.
[Wu19SGC] Wu+, “Simplifying Graph Convolutional Networks”, in Proc. ICML, 2019.
[Wu19survey] Wu+, “A Comprehensive Survey on Graph Neural Networks”, arXiv:1901.00596v1 [cs.LG], 2019.
[Xu19GIN] Xu+, “How powerful are Graph Neural Networks?”, in Proc. ICLR, 2019.
[Yang18GRCNN] Yang+, “Graph R-CNN for Scene Graph Generation”, ECCV, 2018.
[You18GCPN] You+, “Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation”, NeurIPS
2018.
[Zhao_Akoglu20PairNorm] Zhao and Akoglu, “PairNorm: Tackling Oversmoothing in GNNs”, ICLR, 2020.
[Zhou20Deep] Zhou+, “Effective Training Strategies for Deep Graph Neural Networks”, arXiv: 2006.07107, 2020.
[Zhou18survey] Zhou+, “Graph Neural Networks: A Review of Methods and Applications”, arXiv: 1812.08434v2
[cs.LG], 2018. 79
GNN research history
Birth of GNN → The Door Opener → Model Research Explosion / applications → Deeper theory → GNN extremes

Original GNN (2008), Spectral GNN (ICLR14), ChebNet (NIPS16), GCN (ICLR17), MPNN (ICML17), GraphSAGE (NIPS17),
GAT (ICLR18), NLNN (CVPR18), GIN (ICLR19), k-GNN (AAAI19), SGC (ICML19), CPNGNN (NeurIPS19),
Graph Homomorphism Convolution (ICML20), DimeNet (ICLR20)

Original GNN [Scarselli08GNN], Spectral GNN [Bruna14Spectral], ChebNet [Defferrard16ChebNet],
GGNN [Li16GGNN], GCN [Kipf_Welling17GCN], GraphSAGE [Hamilton17GraphSAGE], GAT [Veličković18GAT],
MPNN [Gilmer17MPNN], NLNN [Wang18NLNN], GIN [Xu19GIN], k-GNN [Morris19kGNN], SGC [Wu19SGC],
CPNGNN [Sato19CPNGNN], Graph Homomorphism Convolution [Nguyen_Maehara20Homomorph], DimeNet [Klicpera20DimeNet]
80
Def: Graph signals

• Suppose the 1D node attribute case: an N-dimensional array

[Figure: a high-frequency graph signal vs. a low-frequency graph signal]
[Ma20tutorial]
81
