Spiking-GAN: A Spiking Generative Adversarial Network Using Time-To-First-Spike Coding
1 Introduction
Deep Neural Networks (DNNs) have significantly outperformed traditional algorithms and have set new performance benchmarks in a plethora of applications. However, the increase in complexity driven by their success has led to a keen interest in low-power deep learning techniques, especially for mobile and embedded applications [1].
Spiking Neural Networks (SNNs) are third generation neural networks where
binary ‘spikes’ are the tokens of information. These networks are promising due
to their power efficiency. SNNs are biologically plausible and attempt to mimic
the neuronal dynamics of the brain. Information is coded in the form of spikes
(inspired by action potentials in the brain). Implementation of these networks on
event-driven neuromorphic hardware [2–4] has been found to be both fast and
energy efficient [5, 6]. However, SNNs are notoriously difficult to train, primarily because the SNN neurons (activation functions) are non-differentiable, which makes it hard to propagate the error through the network [7, 8]. As a result, most SNN applications have been limited to simple classification tasks [7, 9]. There is a need to attempt a wider range of problems and explore more challenging tasks using SNNs.
Generative Adversarial Networks (GANs) [10] are currently one of the most
promising and extensively researched deep learning topics [11]. GANs primarily
consist of two networks, the generator and the discriminator, which compete
against each other. The generator is trained to try and deceive the discriminator
by generating synthetic/fake samples from a noise prior. The discriminator in
turn tries to distinguish the generator’s samples from the real data by classifying
them as fake or real. GANs have found an increasing number of applications [11], such as image generation, super-resolution, de-occlusion, image-to-image and text-to-image translation, and even drug discovery. However, there is no equivalent of a GAN in the spiking domain.
SNNs have predominantly employed rate-based coding schemes. Temporal coding provides an alternate framework. Time-to-first-spike (TTFS) coding is a type of temporal coding in which the information is encoded in the spike time of a neuron. It significantly increases the sparsity of the output spike train, thereby giving large energy savings. It has recently been used in SNNs for classification problems [12–14], and hardware implementations of such schemes are also being explored [15]. There is also growing evidence of the biological plausibility of such temporal coding schemes [16, 17]. TTFS coding thus offers increased sparsity and lower inference latency compared to traditional rate-based methods.
In this paper, we propose and demonstrate for the first time a TTFS-coding-based spiking implementation of a simple generative adversarial network, called Spiking-GAN. We demonstrate the generation of good-quality, natural-looking images while improving the sparsity of representation and training.
2 Methods
We have adopted the learning rule and neuron model (secs. 2.2, 2.3 and 2.5) described in S4NN [12] and modified them for our network.
The membrane potential V_j^l(t) of the j^{th} integrate-and-fire (IF) neuron in the l^{th} layer is given by:

V_j^l(t) = \sum_i \sum_{\tau=1}^{t} w_{ji}^l \, S_i^{l-1}(\tau)    (1)

If V_j^l(t) > \theta_j^l, the neuron spikes and V_j^l(t) is reset to its resting potential, taken to be zero, and clamped to that voltage for the next t_{ref} time steps. S_i^{l-1}(t) and w_{ji}^l are the input spike train and the input synaptic weight from the i^{th} neuron in the (l-1)^{th} layer.
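To make these dynamics concrete, the following is a minimal NumPy sketch of eq. 1 together with the threshold, reset and refractory behaviour described above; the function name, shapes and vectorization are our own illustration, not the authors' implementation.

```python
import numpy as np

def simulate_if_layer(spikes_in, weights, theta, t_ref, t_max):
    """Sketch of eq. 1: integrate weighted input spikes, fire when the
    potential crosses theta, then reset to zero and clamp for t_ref steps.

    spikes_in: (t_max, n_in) binary spike trains S^{l-1}
    weights:   (n_out, n_in) synaptic weights w^l
    """
    n_out = weights.shape[0]
    v = np.zeros(n_out)                     # membrane potentials V_j^l
    clamp_until = np.zeros(n_out, dtype=int)
    spikes_out = np.zeros((t_max, n_out))
    for t in range(t_max):
        active = t >= clamp_until           # neurons out of refractory period
        v[active] += weights[active] @ spikes_in[t]
        fired = active & (v > theta)
        spikes_out[t, fired] = 1
        v[fired] = 0.0                      # reset to resting potential (zero)
        clamp_until[fired] = t + 1 + t_ref  # clamp for the next t_ref steps
    return spikes_out
```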
Fig. 1: Effect of refractory period on the spiking rate. (a) Membrane potential of IF neurons having different t_ref (0, 10, 100, 256), shown for a constant unit input (θ = 10). (b) Maximum spiking rate trend with t_ref.
Fig. 2: Input spike encoder: each input neuron spikes only once. The spike time is set to the intensity of the corresponding pixel in the normalized inverted image, scaled by the simulation time (refer to eq. 2). A larger pixel value corresponds to an earlier spike in time.
[Fig. 3: Time-to-first-spike coding (1 spike) vs. rate coding (N spikes at rate N/t_max) over a simulation window of t_max.]
The input images are encoded to spikes using time-to-first-spike (TTFS) coding.
In TTFS coding, the information (here the pixel value) is encoded in the spike
time. Consider any channel of the input image with pixel values in the range [0,
Imax ]. The spike train Sjin (t) of the j th input neuron is given by :
Imax −Ij
(
1 if t = tmax
Sjin (t) = Imax
(3)
0 otherwise
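A minimal sketch of this encoder, assuming 8-bit pixels (I_max = 255) and a discrete time grid of t_max + 1 steps; names are our own:

```python
import numpy as np

def ttfs_encode(image, t_max, i_max=255):
    """Sketch of eq. 3: each pixel produces exactly one spike at
    t = t_max * (i_max - I_j) / i_max, so brighter pixels spike earlier."""
    pixels = image.reshape(-1).astype(float)           # flatten to 784 inputs
    spike_times = np.round(t_max * (i_max - pixels) / i_max).astype(int)
    spikes = np.zeros((t_max + 1, pixels.size))
    spikes[spike_times, np.arange(pixels.size)] = 1.0  # one spike per neuron
    return spikes
```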
Comparison with rate coding: The most common form of neural coding
in SNNs is rate coding. In such schemes, the information is encoded in the firing
rate of the neuron. This, however, means that the temporal information in the
spike trains is lost. In addition to that, the spiking rate has to be controlled
to ensure energy efficiency. In problems where the output is expected to be a real number within a range (up to the required precision), controlling the spiking rate is very difficult. For a simulation time of t_{max}, for the rate to be equal to N/t_{max}, a neuron has to spike N times. Such an encoding would be highly inefficient and take away almost all of the energy benefits provided by SNNs.
In TTFS coding, a single spike represents a number with the same precision, making it highly energy efficient (see fig. 3). It gets rid of all the extra (redundant) spikes which would otherwise be needed for encoding a particular spike rate. In addition, for classification tasks, inference can be made as soon as the first spike is issued in the output layer. This enables extremely fast information processing in such networks, leading to much lower latency.
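As a back-of-the-envelope illustration of this trade-off (illustrative numbers, not from the paper):

```python
# Spike budget for encoding a single value v in [0, 1] over a
# t_max = 256 window (numbers chosen for illustration only).
t_max = 256
v = 0.75
rate_spikes = round(v * t_max)    # rate coding: 192 spikes at rate v
ttfs_spikes = 1                   # TTFS: one spike at t = (1 - v) * t_max
print(rate_spikes / ttfs_spikes)  # 192x more spikes for the same value
```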
[Fig. 4: Spiking-GAN architecture. 100 spike-time noise samples (input neurons 0–99) feed the generator (G); its output is reshaped into a fake spike image (784 spike trains). Real or fake spike images are flattened (neurons 0–783) and fed to the discriminator (D), whose two output neurons decide the class by spike order (here t_0 < t_1, so the output is "FAKE").]
Generator: The generator (G) takes spike noise (z ∼ P_z(z)) as input and outputs a fake spike image X_fake = G(z). This image has the same dimensions as the input spike images. The fake spike image is then decoded in an exactly opposite manner to the encoder to produce the fake image.
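A matching decoder sketch (again our own code, not the authors'), inverting eq. 3 so that earlier spikes map back to brighter pixels and non-firing neurons map to zero intensity:

```python
import numpy as np

def ttfs_decode(spikes, t_max, i_max=255):
    """Sketch of the decoder: the exact inverse of the encoder in eq. 3.
    Pixel intensity is recovered from the first spike time of each neuron;
    a neuron that never fires is treated as spiking at t_max (pixel = 0)."""
    fired = spikes.any(axis=0)
    t_spike = np.where(fired, spikes.argmax(axis=0), t_max)  # first spike time
    return i_max * (t_max - t_spike) / t_max                 # invert eq. 3
```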
Discriminator: The training images are encoded into real spike images X_real (see fig. 2). The discriminator (D) is a binary classifier which takes either the fake spike image G(z), z ∼ P_z(z), from the generator or the real spike image x ∼ P_data(x) as input. It has two output neurons: fake (0) and real (1). The class label is determined by the spike timing (see fig. 5). If the real neuron spikes before the fake neuron, the input image is classified as real, else fake.
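The readout then reduces to comparing the two first-spike times; a minimal sketch, with tie-breaking as our own assumed convention:

```python
def discriminator_label(t_fake, t_real):
    """Spike-time readout sketch: the class whose output neuron fires first
    wins. Ties and non-firing cases default to 'fake' here, which is an
    assumed convention rather than one stated in the text."""
    return "real" if t_real < t_fake else "fake"
```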
Fig. 5: Temporal loss vectors in Spiking-GAN, when N_0 is the desired neuron and τ = min(t_0, t_1). (a) Incorrect class prediction (t_0 > t_1): N_1 incorrectly spikes before N_0, so e_0 is +ve, and e_1 is +ve because N_1 fires. (b) Correct class prediction (t_0 < t_1): N_0 correctly spikes before N_1, so e_0 = 0; however, e_1 is +ve because N_1 fires, and e_1 is zero only if N_1 does not fire at all.
Let D^t = [t_0^o, t_1^o] denote the firing times of the discriminator output neurons, T_d^o = [T_{d,0}, T_{d,1}] the target firing time vector for the discriminator, and T_g^o = [T_{g,0}, T_{g,1}] the target firing time vector that the generator wants the discriminator to believe for its fake data. We use dynamic target firing times that depend on the discriminator output. Let the firing time of the winner output neuron be τ = min(D^t) = min{t_0^o, t_1^o}. We determine T_d^o and T_g^o by the following equations:
T_{d,1}^o = \begin{cases} \tau & \text{if } x \sim X_{real} \\ t_{max} & \text{if } x \sim X_{fake} \end{cases} \quad \text{and} \quad T_{d,0}^o = \begin{cases} t_{max} & \text{if } x \sim X_{real} \\ \tau & \text{if } x \sim X_{fake} \end{cases}    (6)

T_{g,j}^o = \begin{cases} \tau & \text{if } j = 1 \\ t_{max} & \text{if } j = 0 \end{cases}    (7)
For the desired neuron, the target time is set to τ, so we are training it to emit a spike first. For the other neuron, the target time is set to the maximum possible value, the simulation time t_{max} (refer to fig. 5). Note that the target firing time values for fake images are flipped for the generator as compared to the discriminator's values, making the two systems adversaries. For the corner case when neither of the output neurons spikes, the firing times (D^t) are assumed to be t_{max} for loss calculations. Further, τ in eqs. 6-7 is set to 0, thereby heavily penalizing the non-firing desired neuron; this incentivizes it to fire. We also add an L2 regularization term (\sum_l \sum_j \sum_i (w_{ij}^l)^2) to the loss function.
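A sketch of eqs. 6-7 including this corner case; the function signature and the `for_generator` flag are our own framing:

```python
import numpy as np

def target_times(d_t, t_max, real_input, for_generator=False):
    """Sketch of eqs. 6-7 with the non-firing corner case.
    d_t: first spike times [t_o0 (fake), t_o1 (real)] of the discriminator
    output neurons, with non-firing neurons already set to t_max."""
    d_t = np.asarray(d_t, dtype=float)
    tau = d_t.min()               # firing time of the winner neuron
    if (d_t >= t_max).all():      # neither neuron fired:
        tau = 0.0                 # heavily penalize the desired neuron
    # The desired neuron is 'real' (1) for real inputs, 'fake' (0) for
    # fakes; the generator flips this, wanting its fakes believed (eq. 7).
    desired = 1 if (real_input or for_generator) else 0
    targets = np.full(2, float(t_max))
    targets[desired] = tau        # desired neuron should fire first, at tau
    return targets
```

For a fake input, `target_times(d_t, t_max, real_input=False)` yields the discriminator's targets from eq. 6, while adding `for_generator=True` flips them into the generator's targets from eq. 7.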
The appropriate L from eqs. 4-5 is chosen for training the respective network. But calculating these weight updates is a problem: the IF neuron is not a differentiable activation function. However, it approximates ReLU [12, 20]. For a neuron with a ReLU activation function, the output y_j^l of the j^{th} neuron of the l^{th} layer is given by:

y_j^l = \max(0, z_j^l), \quad z_j^l = \sum_i w_{ji}^l \, x_i^{l-1}    (9)

where x_i^{l-1} is the input from the i^{th} neuron in the previous layer and w_{ji}^l is the corresponding input weight. z_j^l for ReLU is equivalent to V_j^l in our model for a given time step. For identical weights, larger input values correspond to a larger value of y_j^l. Similarly, in the TTFS model, where all the information is encoded in the spike time, the larger the input, the higher the membrane potential and the earlier the spike. So, we assume an equivalence relationship between the firing time t_j^l of the IF neuron and the corresponding output y_j^l of a ReLU neuron:

y_j^l \sim t_{max} - t_j^l    (10)
Each neuron spikes only once, so for any neuron only the pre-synaptic inputs that fire before it contribute to its spike time. Using eq. 10, we further assume:

\frac{\partial t_j^l}{\partial V_j^l} = \begin{cases} -1 & \text{if } t_j^l < t_{max} \\ 0 & \text{otherwise} \end{cases}    (11)
Based on eq. 1 and eqs. 9-11, the derivatives for ReLU and IF respectively would be:

\frac{\partial y_j^l}{\partial w_{ji}^l} = \frac{\partial y_j^l}{\partial z_j^l} \frac{\partial z_j^l}{\partial w_{ji}^l} = \begin{cases} x_i^{l-1} & \text{if } y_j^l > 0 \\ 0 & \text{otherwise} \end{cases}    (12)

\frac{\partial t_j^l}{\partial w_{ji}^l} = \frac{\partial t_j^l}{\partial V_j^l} \frac{\partial V_j^l}{\partial w_{ji}^l} = \begin{cases} -\sum_{\tau=1}^{t_j^l} S_i^{l-1}(\tau) & \text{if } t_j^l < t_{max} \\ 0 & \text{otherwise} \end{cases}    (13)
The weights are then updated by gradient descent:

\Delta w_{ji}^l = -\eta \, \delta_j^l \, \frac{\partial t_j^l}{\partial w_{ji}^l}    (14)

where \delta_j^l = \partial L / \partial t_j^l and \eta is the learning rate. To calculate \delta_j^l, we backpropagate the gradient (using the chain rule) for the hidden layers:

\delta_j^l = \frac{\partial L}{\partial t_j^l} = \sum_k \left( \frac{\partial L}{\partial t_k^{l+1}} \frac{\partial t_k^{l+1}}{\partial V_k^{l+1}} \frac{\partial V_k^{l+1}}{\partial t_j^l} \right) = \begin{cases} \sum_k \delta_k^{l+1} w_{kj}^{l+1} & \text{if } t_j^l < t_k^{l+1} \\ 0 & \text{otherwise} \end{cases}    (15)
For the output layer, the deltas for the discriminator and the generator are:

\delta_j^o(D) = \frac{\partial L(D)}{\partial t_j^o} = \left[ D^t(x) - T_d^o \right]_{x \sim P_{data}(x)} + \left[ D^t(G(z)) - T_d^o \right]_{z \sim P_z(z)}    (16)

\delta_j^o(G) = \frac{\partial L(G)}{\partial t_j^o} = \left[ D^t(G(z)) - T_g^o \right]_{z \sim P_z(z)}    (17)
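Putting eqs. 13-17 together, a sketch of the backward pass; the vectorization and the per-connection reading of the condition in eq. 15 are our interpretation:

```python
import numpy as np

def output_delta(d_t, targets):
    """Eqs. 16-17 sketch: the output-layer delta is the difference between
    actual and target firing times (per output neuron)."""
    return np.asarray(d_t, dtype=float) - targets

def hidden_delta(delta_next, w_next, t_this, t_next):
    """Eq. 15 sketch: propagate deltas back through spike times. The
    condition t_j^l < t_k^{l+1} is applied as a per-connection causal mask."""
    causal = (t_this[None, :] < t_next[:, None]).astype(float)    # (k, j)
    return ((delta_next[:, None] * w_next) * causal).sum(axis=0)  # (j,)

def weight_grad(delta, t_spike, spikes_in, t_max):
    """Eqs. 13-14 sketch: dL/dw_ji = delta_j * dt_j/dw_ji, where
    dt_j/dw_ji = -sum_{tau<=t_j} S_i(tau) if the neuron fired, else 0."""
    cum = np.cumsum(spikes_in, axis=0)            # (t_max, n_in)
    counts = cum[np.clip(t_spike, 0, t_max - 1)]  # (n_out, n_in)
    grad = -delta[:, None] * counts
    grad[t_spike >= t_max] = 0.0                  # non-firing neurons
    return grad                                   # apply as w -= eta * grad
```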
2.6 Training
The Discriminator (D) and Generator (G) are trained together in an alternating
manner. While training D, the fake images generated by G are used, but the weights of G are not altered. Similarly, while training G, the prediction of D is used and the loss is backpropagated through D to G, but the weights of D are kept unchanged. We train both G and D once every epoch. Once trained, the generator G, which now (approximately) samples from the same distribution as the training dataset, is used to create fake samples.
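A minimal sketch of this alternating scheme; `apply_updates`, `backprop_through` and `sample_spike_noise` are hypothetical helpers standing in for the forward/backward machinery sketched above:

```python
# G and D are spiking networks exposing forward(); target_times and
# output_delta are the sketches from earlier. apply_updates runs the
# backward pass (eqs. 13-17) and updates only the given network's weights.
for epoch in range(num_epochs):
    for x_real in real_spike_images:
        x_fake = G.forward(sample_spike_noise())

        # Discriminator step (eq. 6 targets); G's weights stay frozen.
        for x, is_real in ((x_real, True), (x_fake, False)):
            d_t = D.forward(x)
            apply_updates(D, output_delta(d_t, target_times(d_t, t_max, is_real)))

        # Generator step (eq. 7 targets): the loss is backpropagated
        # through D to G, but only G's weights are changed.
        d_t = D.forward(G.forward(sample_spike_noise()))
        delta = output_delta(d_t, target_times(d_t, t_max, False, for_generator=True))
        apply_updates(G, backprop_through(D, delta))  # hypothetical helper
```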
3 Results
3.1 Classification
Fig. 6: Comparison of (a) the mean inference time and (b) the mean spike count in the whole network needed for classification of each digit (with S4NN), using a 2-layer dense (784-400-10) network. Our loss function yields better performance for every digit.
Fig. 7: Comparison between 15 randomly picked filters from the 400 (28x28)
hidden weights of a 784-400-10 dense network trained for MNIST classification.
Filters learned using our method are significantly less noisy.
We train the classifier with our aggressive loss function (refer to sec. 2.4). This significantly reduces the inference latency and increases the sparsity of the network as compared to the relative margin-based objective in S4NN, which imposes only a small penalty on the 'incorrect' neurons, and that too only when they fire very close to the ground-truth neuron. It also imposes a much lower penalty when the desired neuron doesn't fire at all.
Fig. 6(a),(b) shows the digit-by-digit comparisons for the same. Our network, on average, takes 59.7 time steps and needs only 194 spikes to make a decision on the class label, which is 33.4% faster inference with 11.1% fewer spikes (S4NN needs 218.3 spikes and takes 89.7 time steps on average). So, on average, only 16.2% of the neurons in the network (of 1194) fire, and the network makes its decision in 23.3% of the simulation time (256 time steps). Note that the network decision can be taken as soon as any output neuron spikes, and all spike counts are the average number of spikes issued in the whole network up to that time step.
Fig. 7(a),(b) compares the filters learned by the two methods; 15 randomly chosen (28x28) weights from the hidden layer are shown. Aggressive TTFS results in qualitatively superior filters: they are much sharper and less noisy. However, the increased sparsity and faster inference lead to a slightly lower test accuracy of 96.7% (as compared to 97.4%). With a bigger network (784-1000-10), though, we were able to achieve a test accuracy of 97.6%, which surpasses the classification performance of other temporal-coding-based works that use simple instantaneous synaptic current kernels.
3.2 Spiking-GAN
We trained our network on the MNIST dataset for each individual digit. The
generator is a 2-layer fully connected (dense) network (100-400-784). 100 spike
noise images are given as an input and the flattened generated fake sample is
the output. The discriminator is a 2-layer fully connected (dense) network (784-
400-2) which takes a flattened spike train as input. Fig. 8(c) shows some selected samples generated by G after training the network for 50 epochs. As can be seen, the generated samples are of high quality.
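For reference, the reported layer sizes as a small configuration sketch (the `LayerSpec` container is hypothetical):

```python
from dataclasses import dataclass

@dataclass
class LayerSpec:
    """Hypothetical container for the dense layer sizes reported above."""
    n_in: int
    n_out: int

GENERATOR     = [LayerSpec(100, 400), LayerSpec(400, 784)]  # 100-400-784
DISCRIMINATOR = [LayerSpec(784, 400), LayerSpec(400, 2)]    # 784-400-2
```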
Fig. 8: Comparison between the training images, some of the good individual
digit samples generated by the Spiking-GAN and some of the selected samples
from an equivalent ANN-based GAN.
Fig. 9: (a) Trend of generated image quality with training epochs. (b) Selected
interpretable ‘undesired’ samples generated by a trained Spiking-GAN.
4 Conclusion
In this paper, we have demonstrated the viability of implementing a generative adversarial network in the spiking domain. The network encodes and transfers information in a highly sparse manner using TTFS coding. We trained the Spiking-GAN using spike-timing-based learning rules. Such a framework could easily be adapted and extended to realize other generative networks like variational autoencoders (VAEs), conditional GANs and auxiliary classifier GANs in the spiking domain. More importantly, it can prove very useful for solving other regression problems like image translation and image segmentation, and even combined classification-regression tasks like object detection, in the spiking domain using spike-based learning rules, while ensuring sparsity, energy efficiency and low latency.
References
1. Goel, A., Tung, C., Lu, Y.H., Thiruvathukal, G.K.: A survey of methods for low-
power deep learning and computer vision. In: 2020 IEEE 6th World Forum on
Internet of Things (WF-IoT). pp. 1–6 (2020)
2. Davies, M., Srinivasa, N., Lin, T.H., Chinya, G., Cao, Y., Choday, S.H., Dimou, G.,
Joshi, P., Imam, N., Jain, S., Liao, Y., Lin, C.K., Lines, A., Liu, R., Mathaikutty,
D., McCoy, S., Paul, A., Tse, J., Venkataramanan, G., Weng, Y.H., Wild, A., Yang,
Y., Wang, H.: Loihi: A neuromorphic manycore processor with on-chip learning.
IEEE Micro 38(1), 82–99 (2018)
3. DeBole, M.V., Taba, B., Amir, A., Akopyan, F., Andreopoulos, A., Risk, W.P.,
Kusnitz, J., Ortega Otero, C., Nayak, T.K., Appuswamy, R., Carlson, P.J., Cassidy,
A.S., Datta, P., Esser, S.K., Garreau, G.J., Holland, K.L., Lekuch, S., Mastro, M.,
McKinstry, J., di Nolfo, C., Paulovicks, B., Sawada, J., Schleupen, K., Shaw, B.G.,
Klamo, J.L., Flickner, M.D., Arthur, J.V., Modha, D.S.: Truenorth: Accelerating
from zero to 64 million neurons in 10 years. Computer 52(5), 20–29 (2019)
4. Painkras, E., Plana, L.A., Garside, J., Temple, S., Galluppi, F., Patterson, C.,
Lester, D.R., Brown, A.D., Furber, S.B.: Spinnaker: A 1-w 18-core system-on-
chip for massively-parallel neural network simulation. IEEE Journal of Solid-State
Circuits 48(8), 1943–1953 (2013)
5. Kim, S., Park, S., Na, B., Yoon, S.: Spiking-yolo: Spiking neural network for energy-
efficient object detection. Proceedings of the AAAI Conference on Artificial Intelli-
gence 34(07), 11270–11277 (Apr 2020), https://ojs.aaai.org/index.php/AAAI/article/view/6787
6. Rajendran, B., Sebastian, A., Schmuker, M., Srinivasa, N., Eleftheriou, E.: Low-
power neuromorphic hardware for signal processing applications: A review of ar-
chitectural and system-level design approaches. IEEE Signal Processing Magazine
36(6), 97–110 (2019)
7. Tavanaei, A., Ghodrati, M., Kheradpisheh, S.R., Masquelier, T., Maida, A.: Deep
learning in spiking neural networks. Neural Networks 111, 47–63 (2019), https://www.sciencedirect.com/science/article/pii/S0893608018303332
8. Pfeiffer, M., Pfeil, T.: Deep learning with spiking neurons: Opportunities and chal-
lenges. Frontiers in Neuroscience 12, 774 (2018), https://www.frontiersin.org/article/10.3389/fnins.2018.00774
9. Illing, B., Gerstner, W., Brea, J.: Biologically plausible deep learning — but how
far can we go with shallow networks? Neural Networks 118, 90–101 (2019), https://www.sciencedirect.com/science/article/pii/S0893608019301741
10. Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair,
S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Proceedings of the
27th International Conference on Neural Information Processing Systems - Volume
2. p. 2672–2680. NIPS’14, MIT Press, Cambridge, MA, USA (2014)
11. Alqahtani, H., Kavakli-Thorne, M., Kumar, G.: Applications of generative adver-
sarial networks (gans): An updated review. Archives of Computational Methods
in Engineering 28(2), 525–552 (Mar 2021)
12. Kheradpisheh, S.R., Masquelier, T.: Temporal backpropagation for spiking neu-
ral networks with one spike per neuron. International Journal of Neural Systems
30(06), 2050027 (2020), pMID: 32466691
13. Park, S., Kim, S., Na, B., Yoon, S.: T2fsnn: Deep spiking neural networks with time-
to-first-spike coding. In: 2020 57th ACM/IEEE Design Automation Conference
(DAC). pp. 1–6 (2020)
14. Rueckauer, B., Liu, S.C.: Conversion of analog to spiking neural networks using
sparse temporal coding. In: 2018 IEEE International Symposium on Circuits and
Systems (ISCAS). pp. 1–5 (2018)
15. Oh, S., Kwon, D., Yeom, G., Kang, W.M., Lee, S., Woo, S.Y., Kim, J.S., Park,
M.K., Lee, J.H.: Hardware implementation of spiking neural networks using time-
to-first-spike encoding (2020)
16. Rullen, R.V., Thorpe, S.J.: Rate coding versus temporal order coding: What the retinal ganglion cells tell the visual cortex. Neural Computation 13(6), 1255–1283 (2001), https://doi.org/10.1162/08997660152002852
17. Tuckwell, H.C., Wan, F.Y.: Time to first spike in stochastic hodgkin–huxley sys-
tems. Physica A: Statistical Mechanics and its Applications 351(2), 427–438 (2005),
https://www.sciencedirect.com/science/article/pii/S0378437104015353
18. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial net-
works. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International
Conference on Machine Learning. Proceedings of Machine Learning Research,
vol. 70, pp. 214–223. PMLR (06–11 Aug 2017), http://proceedings.mlr.press/v70/arjovsky17a.html
19. Mao, X., Li, Q., Xie, H., Lau, R.Y., Wang, Z., Smolley, S.P.: Least squares gener-
ative adversarial networks. In: 2017 IEEE International Conference on Computer
Vision (ICCV). pp. 2813–2821 (2017)
20. Tavanaei, A., Maida, A.: Bp-stdp: Approximating backpropagation using spike
timing dependent plasticity. Neurocomputing 330, 39–47 (2019), https://www.sciencedirect.com/science/article/pii/S0925231218313420
21. Kim, J., Kim, K., Kim, J.J.: Unifying activation- and timing-based learning rules
for spiking neural networks. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan,
M.F., Lin, H. (eds.) Advances in Neural Information Processing Systems. vol. 33,
pp. 19534–19544. Curran Associates, Inc. (2020), https://proceedings.neurips.cc/paper/2020/file/e2e5096d574976e8f115a8f1e0ffb52b-Paper.pdf