It Ain't Much But It's ONNX Work
Abstract
We present ONNXplorer, a tool for visualizing the network topology of
arbitrary deep neural networks up to a given level of complexity. The
tool uses Virtual Reality (VR) technology via the Unity game engine,
and supports different machine learning frameworks through the ONNX
intermediate representation. It displays the topology of a network
together with the neuron activations for a particular input.
1. Introduction
How can we visualize what’s going on in a deep neural network? With so many layers,
and so many connections between the neurons in each layer, things can get pretty
confusing. We turn to Virtual Reality (VR), along with some techniques for efficiently
processing data and culling the less interesting parts of the structure.
We chose to standardize on the ONNX intermediate format for models, allowing support
for models generated with different frameworks (e.g. PyTorch and TensorFlow, which are
both ONNX-compatible). In addition, the ONNX Runtime provides C# bindings,
which enabled compatibility with the Unity game/VR framework.
2. Methods
Replicating the results
GitHub: https://github.com/onnxplorer/ONNXplorer
Research conducted at the Apart Research Alignment Jam #8 (Safety Verification), 2023 (see
https://alignmentjam.com/jam/verification)
Follow the README (remember to run download_dependencies and download_models
before starting Unity).
Tools
We used Unity 2022.1.14f1 as our development framework and C# as our programming
language.
Development History
I’ll spare you the stories of installation hassles with Unity and with C# libraries from
NuGet, or of trying to make certain types of shader work in VR.
Development was divided between the two team members, with one primarily handling VR
rendering and interface, the other primarily handling data processing and model
digestion, and both collaborating on particularly troublesome problems. Entertainingly, a
fair fraction of the code - not to mention the name of the project! - was generated by
ChatGPT and GitHub Copilot: several shaders, some rendering code, a PriorityQueue, and
so forth. The generated code sometimes needed tweaks, and sometimes was simply wrong,
but it often provided the key insight the developer had been missing.
One problem we faced was extracting the shapes of the hidden layers. These are not
encoded explicitly in the ONNX format; instead they are calculated by the ONNX
Runtime (and can in fact depend on the shape of the input data, e.g. the batch size). We
wrote code that calculates shapes and propagates constants for some of the common
ONNX operators. This was sufficient to compute the shapes of the tensors for the
MobileNet model (Howard et al., 2017).
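The kind of shape propagation described above can be sketched as follows. This is an illustrative example, not the ONNXplorer code (which is C#): the function names are hypothetical, and only two operator rules are shown, but the idea is the same - derive each hidden tensor's shape operator by operator from the input shape.

```python
# Hypothetical sketch of per-operator shape propagation (NCHW layout).
# ONNX stores only the input shapes explicitly; hidden shapes follow
# from rules like these, applied in graph order.

def conv2d_shape(input_shape, out_channels, kernel, stride=(1, 1), pad=(0, 0)):
    """Output shape of a 2-D convolution."""
    n, _, h, w = input_shape
    oh = (h + 2 * pad[0] - kernel[0]) // stride[0] + 1
    ow = (w + 2 * pad[1] - kernel[1]) // stride[1] + 1
    return (n, out_channels, oh, ow)

def relu_shape(input_shape):
    """Element-wise operators preserve shape."""
    return input_shape

# Propagate through a toy two-op graph with batch size fixed to 1.
shape = (1, 3, 224, 224)
shape = conv2d_shape(shape, out_channels=32, kernel=(3, 3), stride=(2, 2), pad=(1, 1))
shape = relu_shape(shape)
print(shape)  # (1, 32, 112, 112)
```

A real implementation needs a rule per supported ONNX operator, plus constant propagation for operators (like Reshape) whose output shape depends on a tensor's values rather than just its shape.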
Initially we created a separate C# object per neuron and per connection. We positioned
each tensor in the X direction according to its layer number, and each neuron randomly
in the Y and Z directions.
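The layout scheme above can be sketched in a few lines. Again this is an illustrative Python sketch, not the project's C# code; the function name and the spacing/spread parameters are invented for the example.

```python
import random

def neuron_positions(layer_index, tensor_size, spacing=2.0, spread=1.0, seed=0):
    """Place one tensor's neurons: X fixed by layer index, Y/Z random."""
    rng = random.Random(seed)  # seeded so the layout is reproducible
    x = layer_index * spacing
    return [(x, rng.uniform(-spread, spread), rng.uniform(-spread, spread))
            for _ in range(tensor_size)]

# All five neurons of this tensor share x = 3 * 2.0 = 6.0.
positions = neuron_positions(layer_index=3, tensor_size=5)
```

One consequence of this scheme is that every tensor forms a random cloud in its own YZ plane, so layers are visually separated along X even before any further layout refinement.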
Negative results
We initially had some serious performance problems. While the system ran fine with up
to a few hundred thousand elements, once it got into the millions it began to suffer, and
we along with it. Considering that machine learning models routinely contain tens or
hundreds of millions of elements, this was problematic, and threatened to limit the
system to overly simple examples. While we don’t expect our system to load the largest
models, we wanted to have a decent range, at least.
Unity Hang
Constructing C# objects for every neuron and connection turned out to be too
computationally expensive: it locked up the Unity runtime while it ran and consumed far
too much memory, limiting the size of the models we could display. We could work
around the lockups by introducing threading, but construction still ran too slowly for us
to iterate.
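The threading workaround mentioned above can be sketched generically: build the per-neuron data on a worker thread and drain it in small batches, the way a main loop would consume a few items per frame to stay responsive. This is a hedged Python sketch of the pattern, not the actual Unity/C# implementation; the batch size and item representation are invented for the example.

```python
import queue
import threading

def build_neurons(n, out_queue):
    """Worker thread: stand-in for expensive per-neuron construction."""
    for i in range(n):
        out_queue.put(("neuron", i))
    out_queue.put(None)  # sentinel: construction finished

work = queue.Queue()
threading.Thread(target=build_neurons, args=(10_000, work), daemon=True).start()

built = 0
done = False
while not done:
    # "Per frame": take at most 256 items, then yield back to the renderer.
    for _ in range(256):
        item = work.get()
        if item is None:
            done = True
            break
        built += 1
print(built)  # 10000
```

This keeps the main loop from blocking on construction, but as noted above it does not make construction itself any faster - the total work is the same, just spread out.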
Demonstration video:
https://youtu.be/6wkNMwZ_VAU
5. References
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto,
M., & Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for
Mobile Vision Applications. arXiv:1704.04861. https://arxiv.org/abs/1704.04861