
Applied Intelligence (2023) 53:11142–11161

https://doi.org/10.1007/s10489-022-03902-9

ADAM-DPGAN: a differential private mechanism for generative adversarial network

Maryam Azadmanesh1 · Behrouz Shahgholi Ghahfarokhi1 · Maede Ashouri Talouki1

Accepted: 14 June 2022 / Published online: 1 September 2022


© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022

Abstract
Privacy-preserving data release is a major concern of many data mining applications. Using Generative Adversarial
Networks (GANs) to generate an unlimited number of synthetic samples is a popular replacement for data sharing. However,
GAN models are known to implicitly memorize details of the sensitive data used for training. To this end, this paper proposes
ADAM-DPGAN, which guarantees differential privacy of the training data for GAN models. ADAM-DPGAN specifies the
maximum effect of each sensitive training record on the model parameters at each step of the learning procedure when
the Adam optimizer is used, and adds appropriate noise to the parameters during training. ADAM-DPGAN
leverages the Rényi differential privacy accountant to track the spent privacy budget. In contrast to prior work, by accurately
determining the effect of each training record, this method can perturb the parameters more precisely and generate higher-quality
outputs while provably preserving the convergence properties of the non-private GAN counterpart without privacy leakage.
Through experimental evaluations on different image datasets, ADAM-DPGAN is compared to previous methods, and
its superiority over those methods is demonstrated in terms of visual quality, realism and
diversity of generated samples, convergence of training, and resistance to membership inference attacks.

Keywords Differential privacy · Generative adversarial network · Deep learning · Information leakage

✉ Behrouz Shahgholi Ghahfarokhi
shahgholi@eng.ui.ac.ir

Maryam Azadmanesh
m.azadmanesh@eng.ui.ac.ir

Maede Ashouri Talouki
m.ashouri@eng.ui.ac.ir

1 Faculty of Computer Engineering, University of Isfahan, Isfahan 8174673441, Iran

1 Introduction

Nowadays, deep learning brings many benefits to various applications such as marketing and business decisions, medical diagnosis and treatment, and cybersecurity and fraud detection. Using large datasets to train deep learning models is one of the important factors in the success of these models, but there are obstacles to providing large datasets for training, including privacy concerns and data scarcity. To protect privacy, numerous mechanisms such as l-diversity [1] and t-closeness [2] have been developed over the years. However, the privacy level provided by these techniques depends on the background knowledge of the adversaries. Furthermore, these simple anonymization techniques cannot address the data scarcity problem, which is a main issue in some applications such as medical diagnosis and treatment. Thus, the privacy of releasing real datasets remains a major concern.

Fortunately, generative models can help to solve the data scarcity issue. A generative model is a powerful way of learning the training data distribution and then allowing the generation of an unlimited number of synthetic samples. The Generative Adversarial Network (GAN) [3] is a popular class of generative models that has gained significant attention in recent years. GANs and their variants have been used in many applications and can generate synthetic data very similar to the training data. The GAN architecture typically comprises two neural networks, a generator and a discriminator. The discriminator's task is to separate generated samples from training ones, and the generator tries to deceive the discriminator by synthesizing samples that the discriminator misclassifies.

Although GANs can alleviate the data scarcity issue, they do not guarantee privacy, and they are known to implicitly memorize private details of the sensitive training
data. Different attacks [4–6] have been conducted against these models, which can infer information about training datasets. Therefore, there is a strong demand for incorporating privacy mechanisms into GANs.

For this reason, several studies [7–14] have recently been conducted on combining privacy mechanisms with GAN models. These studies generally fall into two categories. The first category focuses on applying a strong privacy standard, i.e. differential privacy, to GAN models [7–13], and the second category only defends against a particular attack [14]. Although the first category provides strong privacy, some of these approaches [7–9] suffer from problems such as low synthetic sample quality, a low convergence rate, and the need to fine-tune hyper-parameters, while the others [10–13] provide privacy only for the generator network of the GAN model. On the other hand, the second category's defense mechanisms make assumptions about the adversary's knowledge and are not universal solutions. Therefore, a solution that can provide differential privacy for both the discriminator and the generator, while generating high-quality synthetic samples without a low training convergence rate, remains an open problem.

To this end, this paper proposes ADAM-DPGAN, a differentially private method for GAN models that provides differential privacy for both the discriminator and the generator networks when the Adam optimizer [15] is used. In contrast to previous work, ADAM-DPGAN accurately determines the impact of each sensitive data sample on the model parameters at each step of the learning procedure and adds appropriate noise to the parameters to guarantee differential privacy of the training data samples without meaningfully impacting the quality of the final synthesized outputs. By specifying the maximum effect of each sample on the model parameters, the need for traditionally expensive computational operations such as per-example calculation of gradients [7, 8], changing the GAN architecture (e.g. additional neural network training) [9–12], and clipping the parameters/gradients during the training procedure [7, 8] is eliminated. As a result, there is no need for public data access to adaptively adjust hyper-parameters such as the clipping bound, as is done in [7]. Moreover, ADAM-DPGAN guarantees differential privacy for both the discriminator and the generator networks; unlike [14], it makes no assumptions about the attacker's knowledge, and unlike [10–13], it can be used when the discriminator network needs to be differentially private. ADAM-DPGAN is built upon the improved Wasserstein Generative Adversarial Network (WGAN) [16], but the idea can be used with any type of model, even discriminative models, that uses the Adam optimizer during training. To measure privacy loss, ADAM-DPGAN leverages the Rényi differential privacy accountant [17] and can generate high-quality synthetic samples with reasonable privacy budgets. Through experimental evaluations, ADAM-DPGAN is also compared with the GANobfuscator method [7], the GS-WGAN method [12] and the method of reference [13] on the MNIST, Fashion-MNIST and CelebA datasets. Experimental results demonstrate the superiority of ADAM-DPGAN over the prior methods in terms of visual quality, realism and diversity of generated samples, convergence of training, and resistance to membership inference attacks. In summary, the contributions of this paper are as below:

1. Determining the global sensitivity of the GAN model parameters in each training step when the Adam optimizer is used.
2. Presenting the ADAM-DPGAN algorithm, which uses the introduced sensitivity to guarantee differential privacy for GAN models.
3. Proving the differential privacy property of ADAM-DPGAN.
4. Evaluating ADAM-DPGAN under various image datasets and network structures and demonstrating its performance with reasonable privacy budgets.

The remainder of the paper is organized as follows. Section 2 reviews the related work. Section 3 introduces GAN and relevant concepts used in the paper. Section 4 presents the details of the proposed ADAM-DPGAN method. In Section 5, the proposed method is evaluated and, finally, in Section 6, the findings are summarized and the conclusion is presented.

2 Related work

This paper is mostly related to two strands of literature: first, the literature on attacks conducted against machine learning (ML) models to infer information about their training dataset; second, the literature on designing privacy-preserving mechanisms for ML models. The following two subsections overview the related work in these areas.

2.1 Privacy attacks against ML models

The membership inference attack and the model inversion attack are the two main attacks that infer information about training data from a trained model. In a model inversion attack, the attacker uses the trained model's output to derive the values of the training records' attributes. A model inversion attack against neural networks is introduced by [20], where the attacker finds the input that maximizes the returned classification confidence value. Later, Yang et al. [21] propose a model inversion attack under an adversarial setting where a second neural network is trained to reconstruct the sample based on the prediction vector.
In a membership inference attack, given an input record and access to the trained model, the attacker determines whether the record was in the model's training dataset or not. The first membership inference attack against neural network models was introduced by [22]. Yeom et al. [23] formulate the quantitative advantage of adversaries for the membership inference attack in terms of generalization error and influence. Sablayrolles et al. [24] exploit a probabilistic framework to derive an optimal strategy for the membership inference attack. Nasr et al. [25] present white-box membership inference attacks against deep neural networks.

The first membership inference attack against generative models was introduced by [4], in which the authors use the discriminator's output to learn statistical differences between training data members and non-members. Hilprecht et al. [5] conduct a membership inference attack using generated samples of the model, where a record with the largest number of nearest generated samples is inferred as a member record. Later, Chen et al. [6] extend Hilprecht's attack to the white-box, partial black-box generator, and full black-box settings.

2.2 Privacy preserving mechanisms in ML models

Many approaches have been developed to protect ML models against privacy attacks. These approaches generally fall into two broad categories: 1) differentially private approaches and 2) empirical approaches. The first category provides a strong theoretical guarantee for privacy but low synthetic sample quality. In contrast, the second category cannot guarantee strict privacy; it empirically protects training data against a particular attack while imposing a negligible utility loss.

In the first category, the efforts are mostly based on gradient perturbation [7, 8, 10–12, 26], input perturbation [9, 27, 28] and objective perturbation [29].

Abadi et al. [26] propose gradient perturbation to protect against privacy leakage of training data in discriminative models. In their method, per-example gradients of the loss function are computed, the gradients are clipped based on clipping bounds, and then, to guarantee differential privacy, random noise is added to the clipped gradients. Abadi's method has also been adapted for GAN models, named DP-SGD GAN [7, 8]. GANobfuscator [7] uses gradient perturbation in the discriminator of the improved WGAN [16]. Torkzadehmahani et al. [8] exploit Abadi's method with a conditional GAN [30] to generate synthetic samples with corresponding labels in a differentially private manner. While the above methods provide a privacy guarantee, per-example calculation of gradients has a computational overhead [31, 32]. Moreover, clipping bounds have a considerable impact on the model's performance, and optimal bounds depend on many hyper-parameters such as the learning rate, model architecture and training dynamics. Furthermore, the effect of per-example clipping is theoretically examined in [33], where it is confirmed to cause slower convergence than the non-private counterpart. In GAN training, clipping the gradients and adding noise based on clipping bounds lead to slower convergence, training instability and low-quality synthetic samples. Although GANobfuscator exploits a small public dataset to tackle these problems, the availability of such public data is not a practical assumption for many applications.

Input perturbation is used by Papernot et al. [27] to provide differential privacy. In their method, called Private Aggregation of Teacher Ensembles (PATE), multiple teacher models are trained on disjoint partitions of the training data, and a differentially private student model is trained on public data labeled by noisy voting among all of the teachers. Scalable PATE [28] improves the utility of PATE with a Confident-GNMax aggregator. However, both PATE and Scalable PATE require unlabeled public data, and their aggregators can only be applied to categorical data. PATE-GAN [9] is a modified version of PATE. In PATE-GAN, K teacher-discriminators and one student-discriminator are trained, and the student-discriminator is trained on the generated synthetic samples labeled by the teachers. In PATE-GAN, the need for public data access has been resolved, but at the beginning of training the generator cannot generate enough samples labeled as real by the teachers, and as the training progresses the opposite is true. So it seems the student-generator training procedure may fail to learn the real distribution without seeing real samples.

The PATE method has also been combined with gradient perturbation to guarantee differential privacy only for the generator. In G-PATE [10], aggregated information provided by the teacher-discriminators is used to train a student-generator. DataLens [11] improves the utility of G-PATE [10] with top-k gradient compression. GS-WGAN [12] is another method based on the combination of PATE with gradient perturbation. In this method, at each training step, a randomly selected discriminator and the generator update their parameters. To prevent information leakage from the selected teacher to the student-generator, Abadi's method [26] with the improved WGAN [16] is used, and the gradient clipping bound is set to one. Han et al. [13] propose another method which provides a privacy guarantee for the generator network. In this method, in each update of the generator parameters, the discriminator loss is clipped and appropriate noise is added to it. Ensuring privacy only for the generator is one of the shortcomings of these methods.

Objective perturbation is another method that falls in the first category. Phan et al. [29] exploit objective perturbation to guarantee differential privacy for a deep auto-encoder.
While their approach provides privacy for auto-encoder models, it cannot be generalized to other types of deep neural networks.

As described, the second category of defense mechanisms only focuses on mitigating a particular type of attack. Nasr et al. [34] introduce a privacy model that is robust against membership inference attacks with black-box access. To do this, they design a multi-objective learning algorithm with the goal of minimizing both the classification loss and the maximum gain of the membership inference attack. MemGuard [35] is another defense mechanism against the black-box membership inference attack, which adds a noise vector to the predicted classification confidence score. Yang et al. [36] introduce a filtering framework to defend against model inversion and membership inference attacks. PrivGAN [14] is another empirical defense for GAN models, which defends against the membership inference attack. The fact that these methods only defend against a particular attack is their shortcoming compared to the method presented in this paper.

3 Preliminaries

In this section, differential privacy [37–39] and related concepts are reviewed. Then GAN [3] and its variants are introduced; finally, Adam optimization [15] is discussed.

3.1 Differential privacy

Differential privacy is a strong standard to quantify an individual's privacy loss for algorithms on aggregated data. Differential privacy is defined using adjacent databases. In ML applications, adjacent databases refer to two training datasets that differ by one training record. Informally, a differentially private algorithm is a randomized algorithm whose output is nearly identical on adjacent datasets. The formal definition of differential privacy is presented below.

Definition 1 (($\epsilon$, $\delta$)-Differential privacy [37]) A randomized mechanism $F: X \to Y$ complies with $(\epsilon, \delta)$-differential privacy if for any two neighboring inputs $X, X' \in X$ and for any possible set of outputs $S \subseteq Y$,

$$\Pr[F(X) \in S] \le e^{\epsilon} \Pr[F(X') \in S] + \delta \quad (1)$$

Here, $\epsilon$ and $\delta$ are referred to as the privacy budget and the confidence parameter, respectively.

Post-processing is a useful feature of differential privacy [38]. It means that any function of the output of a differentially private algorithm will not invade privacy. Formally, if $F: X \to Y$ satisfies $(\epsilon, \delta)$-differential privacy and $G: Y \to Y'$ is an arbitrary data-independent mapping, then $G \circ F: X \to Y'$ also satisfies $(\epsilon, \delta)$-differential privacy.

The Gaussian noise-based mechanism is one of the common mechanisms that provides differential privacy for a real-valued function by adding Gaussian noise scaled to the sensitivity of the function. The sensitivity of a function $f$ (i.e. $S_f$) is the maximum distance between its outputs on two adjacent inputs. Formally, $S_f$ is defined as:

$$S_f = \max_{\text{adjacent } X, X'} |f(X) - f(X')| \quad (2)$$

Regarding sensitivity $S_f$, the Gaussian noise-based mechanism is formulated as:

$$F(X) = f(X) + N(0, S_f^2 \sigma^2) \quad (3)$$

where $N(0, S_f^2 \sigma^2)$ is the added noise, randomly selected according to a normal distribution with mean 0 and standard deviation $S_f \sigma$. A Gaussian mechanism complies with $(\epsilon, \delta)$-differential privacy if $\sigma^2 \ge 2\ln(1.25/\delta)/\epsilon^2$ [38].
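To make the mechanism concrete, the following minimal Python sketch applies (3) to a bounded query. The function name and the example query are illustrative assumptions, not code from the paper.

```python
import numpy as np

def gaussian_mechanism(f_value, sensitivity, epsilon, delta):
    """Release f_value with (epsilon, delta)-DP via the Gaussian mechanism (3).

    sigma is chosen so that sigma^2 >= 2*ln(1.25/delta)/epsilon^2, and the
    noise standard deviation is scaled by the sensitivity S_f.
    """
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noise = np.random.normal(0.0, sensitivity * sigma, size=np.shape(f_value))
    return f_value + noise

# Example: a mean of values clipped to [0, 1] over n records has
# sensitivity 1/n under the one-record notion of adjacency.
data = np.random.rand(1000)
private_mean = gaussian_mechanism(np.mean(data), sensitivity=1.0 / len(data),
                                  epsilon=1.0, delta=1e-5)
```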
Composability is a feature of differential privacy that enables the combination of multiple differentially private mechanisms into one. However, a basic problem is tracking the overall privacy loss of the composite mechanism. For this purpose, several methods have been developed, including advanced composition theorems [39], the moments accountant method [26] and the Rényi Differential Privacy (RDP) accountant method [40]. The moments accountant [26] and the RDP accountant [40] work particularly well with uniformly subsampled Gaussian mechanisms, and RDP provides a more accurate bound for the privacy loss than the moments accountant [17]. The RDP accountant is defined in terms of the Rényi divergence.

Definition 2 (Rényi divergence [40]) The $\lambda$-order Rényi divergence between two distributions $P$ and $P'$ is stated as:

$$D_\lambda(P \,\|\, P') \triangleq \frac{1}{\lambda - 1} \log \mathbb{E}_{x \sim P'}\!\left[\left(\frac{P(x)}{P'(x)}\right)^{\lambda}\right] = \frac{1}{\lambda - 1} \log \mathbb{E}_{x \sim P}\!\left[\left(\frac{P(x)}{P'(x)}\right)^{\lambda - 1}\right] \quad (4)$$

Definition 3 (Rényi differential privacy [40]) A randomized mechanism $F$ is $(\lambda, \epsilon)$-RDP with $\lambda \ge 1$ if for any neighboring datasets $X$ and $X'$, the Rényi divergence between $F(X)$ and $F(X')$ satisfies:

$$D_\lambda(F(X) \,\|\, F(X')) = \frac{1}{\lambda - 1} \log \mathbb{E}_{y \sim F(X)}\!\left[\left(\frac{\Pr[F(X) = y]}{\Pr[F(X') = y]}\right)^{\lambda - 1}\right] \le \epsilon \quad (5)$$

In this paper, we exploit the following two properties that allow some combination of RDP mechanisms and
also the conversion of an RDP bound to an $(\epsilon, \delta)$-differential privacy bound, as proved by Mironov et al. [40].

Theorem 1 (RDP composition [40]) Let $F_1, \ldots, F_k$ be a sequence of $(\lambda, \epsilon_i)$-RDP mechanisms; then the composition $F_k \circ \ldots \circ F_1$ guarantees $(\lambda, \sum_{i=1}^{k} \epsilon_i)$-RDP.

Theorem 2 (Converting RDP to DP [40]) If a mechanism $F$ is $(\lambda, \epsilon)$-RDP, then $F$ complies with $(\epsilon + \frac{\log(1/\delta)}{\lambda - 1}, \delta)$-differential privacy for any $\delta \in (0, 1)$.

3.2 GAN and its variants

The GAN [3] architecture typically comprises two neural networks, a generator $G$ and a discriminator $D$, in which $G$ learns to map from a latent distribution $p_z$ to the true data distribution $p_{data}$, while $D$ discriminates between instances sampled from $p_{data}$ and those generated by $G$. $G$'s objective is to "deceive" $D$ by synthesizing instances that appear to be from $p_{data}$. The training goal is formulated as

$$\min_{\theta_G} \max_{\theta_D} \mathbb{E}_{x \sim p_{data}}[\log(D_{\theta_D}(x))] + \mathbb{E}_{z \sim p_z}[\log(1 - D_{\theta_D}(G_{\theta_G}(z)))] \quad (6)$$

where $\theta_G$ and $\theta_D$ represent the parameters of the generator network and the discriminator network, respectively [3]. Figure 1 shows the GAN architecture.

Fig. 1 GAN architecture

Despite its simplicity, the original GAN formulation is unstable and inefficient to train. A number of subsequent studies propose new training procedures and network architectures to improve the stability and convergence rate. In particular, WGAN [41] and the improved training of WGANs [16] attempt to minimize the Earth Mover's distance between the synthesized distribution and the true distribution, rather than the Jensen-Shannon divergence as in the original GAN formulation. The objective function of the improved WGAN [16] is given by:

$$\min_{\theta_G} \max_{\theta_D} \mathbb{E}_{x \sim p_{data}}[D_{\theta_D}(x)] - \mathbb{E}_{z \sim p_z}[D_{\theta_D}(G_{\theta_G}(z))] + \lambda (\|\nabla_{\tilde{x}} D_{\theta_D}(\tilde{x})\|_2 - 1)^2 \quad (7)$$

$\theta_G$ and $\theta_D$ represent the parameters of the generator network and the discriminator network, respectively. Also, $\tilde{x} = \epsilon x + (1 - \epsilon) G_{\theta_G}(z)$, where $\epsilon$ is a random number sampled from $[0, 1]$ according to a uniform distribution.
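For reference, a minimal TensorFlow sketch of the gradient penalty term in (7) is given below. The NHWC image layout and the discriminator callable are assumptions, and the default weight of 10 follows the experimental setting in Section 5.1; this is an illustration, not the paper's implementation.

```python
import tensorflow as tf

def gradient_penalty(discriminator, real, fake, gp_weight=10.0):
    """Improved-WGAN gradient penalty of (7) on interpolates
    x_tilde = eps*x + (1-eps)*G(z), with eps ~ U[0, 1]."""
    eps = tf.random.uniform([tf.shape(real)[0], 1, 1, 1], 0.0, 1.0)
    x_tilde = eps * real + (1.0 - eps) * fake
    with tf.GradientTape() as tape:
        tape.watch(x_tilde)
        d_out = discriminator(x_tilde, training=True)
    grads = tape.gradient(d_out, x_tilde)
    norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return gp_weight * tf.reduce_mean(tf.square(norms - 1.0))
```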

3.3 Adam optimization

Adam is a method that adapts the learning rate of each neural network weight using estimates of the first and second moments of the gradient. To estimate those moments, exponentially moving averages of the gradient and the squared gradient are computed on the current mini-batch [15], denoted by $m_t$ and $v_t$:

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t \quad (8)$$

$$v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2 \quad (9)$$

where $\beta_1, \beta_2 \in [0, 1)$. Since the moment estimators are biased toward zero, Adam uses the bias-corrected estimates $\hat{m}_t$ and $\hat{v}_t$ defined in (10) and (11):

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t} \quad (10)$$

$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t} \quad (11)$$
Using these moving averages, Adam's parameters are updated through (12):

$$w_t = w_{t-1} - \alpha \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \gamma} \quad (12)$$

where $\gamma$ is a small constant for ensuring stability and $\alpha$ is the stepsize [15].
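For concreteness, the following NumPy sketch transcribes one Adam update from (8)-(12); the default hyper-parameters mirror the paper's MNIST setting in Section 5.1 but are otherwise arbitrary.

```python
import numpy as np

def adam_step(w, g, m, v, t, alpha=5e-5, beta1=0.5, beta2=0.9, gamma=1e-8):
    """One Adam update for parameters w with gradient g at step t >= 1."""
    m = beta1 * m + (1 - beta1) * g            # first-moment average, (8)
    v = beta2 * v + (1 - beta2) * g**2         # second-moment average, (9)
    m_hat = m / (1 - beta1**t)                 # bias correction, (10)
    v_hat = v / (1 - beta2**t)                 # bias correction, (11)
    w = w - alpha * m_hat / (np.sqrt(v_hat) + gamma)   # update, (12)
    return w, m, v
```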

4 Proposed method

In the following, the first subsection describes the ADAM-DPGAN algorithm and the second subsection is dedicated to the privacy analysis of ADAM-DPGAN.

4.1 ADAM-DPGAN algorithm

The goal of the proposed ADAM-DPGAN algorithm is to train GANs such that the privacy leakage of each individual record of the training dataset is bounded by a differential privacy constraint. As Fig. 1 shows, in the GAN architecture the sensitive data is only fed into the discriminator and directly impacts the discriminator weights and gradients. In the training procedure, this gradient information propagates from the discriminator back to the generator. Therefore, to guarantee differential privacy, the contribution of each individual record to the model parameters should be limited. For this reason, the impact of each individual on the parameters in each training step is limited, and by the composability property of differential privacy, each record's effect on the parameters of the final model is restricted.

Since the discriminator has access to the real data, while the generator's access to the real data is indirect and only through the discriminator's parameters, the Gaussian noise mechanism is applied to the discriminator parameters to guarantee differential privacy in each step, and the post-processing property then guarantees differential privacy for the generator parameters. To do this, as (3) shows, the global sensitivity of the discriminator parameters must be specified.

The main contribution of this article is to determine the global sensitivity of the discriminator parameters at each step of training when the Adam optimizer is used. In this article, ADAM-DPGAN is built upon the improved WGAN [16], but it can be used with any type of GAN that uses the Adam optimizer. Algorithm 1 outlines the ADAM-DPGAN mechanism. At each step of discriminator training, first, random samples of real data and samples of synthetic data are used to compute the gradients of the discriminator parameters (lines 4-6). Then, the Adam optimizer is used to update the parameter values, and Gaussian noise based on the global sensitivity is added to them to guarantee differential privacy (line 7). Additionally, privacy accountants similar to [17] are used to track the accumulated privacy (line 8), which will be explained in detail in Section 4.2. This process is iterated $I_d$ times (the number of discriminator iterations per generator iteration) and then the generator parameters are updated (lines 10-12). The optimization procedure is repeated until convergence.
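Algorithm 1 itself appears as a figure in the paper; the sketch below reconstructs its flow from the description above (sampling, $I_d$ noisy discriminator updates, one generator update). All callables and the exact noise placement are assumptions based on the text, not the authors' code.

```python
import numpy as np

def train_adam_dpgan(d_params, grad_fn, gen_update_fn, sample_batch,
                     adam_step, sigma_n, delta_t, I_d, I_g):
    """Sketch of the ADAM-DPGAN loop described above.

    grad_fn(d_params, real, fake) -> gradient array (improved-WGAN critic loss)
    adam_step(params, grads, t)   -> Adam-updated params, per (8)-(12)
    The callables are placeholders for the user's own model code.
    """
    t = 0
    for _ in range(I_g):                       # generator iterations
        for _ in range(I_d):                   # discriminator iterations per generator step
            t += 1
            real, fake = sample_batch()        # sensitive batch + synthetic batch
            grads = grad_fn(d_params, real, fake)
            d_params = adam_step(d_params, grads, t)
            # Gaussian mechanism on the parameters themselves: noise scaled by
            # the per-step global sensitivity delta_t (Theorem 3 or 4), so no
            # per-example gradients or clipping are needed.
            d_params = d_params + np.random.normal(0.0, sigma_n * delta_t,
                                                   size=d_params.shape)
            # an RDP accountant would record this sampled Gaussian step here
        gen_update_fn(d_params)                # generator update (post-processing)
    return d_params
```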
Informally, Algorithm 1 provides the following privacy-preserving properties on the sensitive real data. (1) In each step of training, a batch of sensitive real data is sampled and used to update the discriminator parameters. To preserve the privacy of this batch, the Gaussian mechanism is used and the discriminator parameters are perturbed. Clearly, the privacy of the real-data records that are not present in the batch is also preserved, since they have no effect on the updated values of the parameters. Therefore, in each step of training, the discriminator parameters preserve the privacy of the sensitive real data. (2) Since at each step of training the generator network has access to the sensitive data only through the discriminator network and the discriminator preserves the
privacy of the sensitive data, it can be concluded, using the post-processing property of differential privacy, that in each step of training the generator parameters also preserve the privacy of the sensitive data. (3) Since at each step of training the discriminator and the generator preserve the privacy of the sensitive real data, it can be proved, using the composability property of differential privacy, that the final trained model also preserves the privacy of the sensitive data. The formal privacy analysis is detailed in Section 4.2.

The global sensitivity ($\Delta_t$) in Algorithm 1 is the main parameter that must be specified. This is due to the fact that, to guarantee differential privacy of the discriminator parameters, the noise magnitude should depend on the global sensitivity of the discriminator's parameters. Therefore, the following theorem computes the global sensitivity of the discriminator's parameters in each step of training.

Theorem 3 If the Adam optimizer is used in the optimization procedure of the discriminator network, the global sensitivity of the discriminator parameters in each step of training will be $\max\!\left(\alpha_D, \frac{\alpha_D (1 - \beta_{1_D})}{\sqrt{1 - \beta_{2_D}}}\right)$, where $\alpha_D$, $\beta_{1_D}$, $\beta_{2_D}$ are the Adam hyper-parameters in the discriminator.

Proof As Kingma et al. [15] describe, the absolute value of the effective stepsize ($|\Delta_t|$) in the Adam update rule has two upper bounds: if $(1 - \beta_{1_D}) > \sqrt{1 - \beta_{2_D}}$, then $|\Delta_t| \le \frac{\alpha_D (1 - \beta_{1_D})}{\sqrt{1 - \beta_{2_D}}}$; otherwise $|\Delta_t| \le \alpha_D$. Therefore, for any neighboring databases $X$ and $X'$, the parameter values differ by at most $\max\!\left(\alpha_D, \frac{\alpha_D (1 - \beta_{1_D})}{\sqrt{1 - \beta_{2_D}}}\right)$ in each step.

According to Theorem 3, the higher $\alpha_D$, the more the parameter values change from their previous values and, therefore, the more noise is needed to provide privacy of the sensitive data in each training step. On the other hand, a low learning rate leads to a long training process, while it results in lower noise at any stage of the training procedure. The factor $\frac{1 - \beta_{1_D}}{\sqrt{1 - \beta_{2_D}}}$ is the other influential factor in the amount of noise required to provide privacy: when $\beta_{1_D} \to 0$ and $\beta_{2_D} \to 1$, more noise is needed to provide sensitive data privacy.

Kingma et al. [15] assume that in most common scenarios, $\hat{m}_t$ and $\hat{v}_t$ ((10) and (11)) are unbiased estimations of the first and second moments of the gradients. Since $\left|\frac{\hat{m}_t}{\sqrt{\hat{v}_t}}\right| \cong \left|\frac{\mathbb{E}[g]}{\sqrt{\mathbb{E}[g^2]}}\right| \le 1$, the absolute value of the effective stepsize is $\Delta_t = \left|\alpha_D \frac{\hat{m}_t}{\sqrt{\hat{v}_t}}\right| \le \alpha_D$. However, in non-convex stochastic optimization, which is the practical case in deep neural networks, Adam's update direction may no longer be an unbiased estimation of the gradient moments [42]. As a result, the upper bound of Theorem 3 may not be met. The following theorem provides a new upper bound for the global sensitivity, which is independent of Adam's gradient moment estimation.

Theorem 4 If the Adam optimizer is used in the optimization procedure of the discriminator network, and assuming that $\beta_{1_D} \le \sqrt{\beta_{2_D}}$ (the common setting of Adam's hyper-parameters), the global sensitivity of the discriminator parameters in each step of the training will be $\alpha_D \times \frac{1}{\sqrt{1 - \beta_{2_D}}} \times \frac{1}{1 - \frac{\beta_{1_D}}{\sqrt{\beta_{2_D}}}}$.

Proof According to (12), Adam's parameter update rule is $w_t = w_{t-1} - \alpha_D \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \gamma}$. As $\gamma \cong 0$, the effective step taken in parameter space at time step $t$ is $\Delta_t = \alpha_D \frac{\hat{m}_t}{\sqrt{\hat{v}_t}}$. Expanding $\hat{m}_t$ and $\hat{v}_t$, we obtain

$$\left|\alpha_D \frac{\hat{m}_t}{\sqrt{\hat{v}_t}}\right| = \left|\alpha_D \frac{\sqrt{1 - \beta_{2_D}^t}}{1 - \beta_{1_D}^t} \times \frac{(1 - \beta_{1_D}) \sum_{j=1}^{t} \beta_{1_D}^{t-j} g_j}{\sqrt{(1 - \beta_{2_D}) \sum_{k=1}^{t} \beta_{2_D}^{t-k} g_k^2}}\right|$$

If $\beta_{1_D} = 0$ and $\beta_{2_D} = 0$, given that $0^0 = 1$, the fraction reduces to $g_t / \sqrt{g_t^2}$, which has absolute value one, so the whole expression equals $\alpha_D$. Assuming $\beta_{2_D} \neq 0$, we have

$$\left|\alpha_D \frac{\sqrt{1 - \beta_{2_D}^t}}{1 - \beta_{1_D}^t} \times \frac{(1 - \beta_{1_D}) \sum_{j=1}^{t} \beta_{1_D}^{t-j} g_j}{\sqrt{(1 - \beta_{2_D}) \sum_{k=1}^{t} \beta_{2_D}^{t-k} g_k^2}}\right| \le \left|\alpha_D \frac{\sqrt{1 - \beta_{2_D}^t}}{1 - \beta_{1_D}^t} \sum_{j=1}^{t} \frac{(1 - \beta_{1_D}) \beta_{1_D}^{t-j} g_j}{\sqrt{(1 - \beta_{2_D}) \beta_{2_D}^{t-j} g_j^2}}\right|$$

$$= \left|\alpha_D \frac{\sqrt{1 - \beta_{2_D}^t}}{1 - \beta_{1_D}^t} \times \frac{1 - \beta_{1_D}}{\sqrt{1 - \beta_{2_D}}} \sum_{j=1}^{t} \left(\frac{\beta_{1_D}}{\sqrt{\beta_{2_D}}}\right)^{t-j}\right| = \left|\alpha_D \frac{1 - \beta_{1_D}}{1 - \beta_{1_D}^t} \times \frac{\sqrt{1 - \beta_{2_D}^t}}{\sqrt{1 - \beta_{2_D}}} \times \frac{1 - \left(\frac{\beta_{1_D}}{\sqrt{\beta_{2_D}}}\right)^t}{1 - \frac{\beta_{1_D}}{\sqrt{\beta_{2_D}}}}\right|$$

$$\le \left|\alpha_D \times \frac{1}{\sqrt{1 - \beta_{2_D}}} \times \frac{1 - \left(\frac{\beta_{1_D}}{\sqrt{\beta_{2_D}}}\right)^t}{1 - \frac{\beta_{1_D}}{\sqrt{\beta_{2_D}}}}\right| \le \left|\alpha_D \times \frac{1}{\sqrt{1 - \beta_{2_D}}} \times \frac{1}{1 - \frac{\beta_{1_D}}{\sqrt{\beta_{2_D}}}}\right|$$

The first inequality follows from the triangle inequality together with the fact that a sum of weighted squares with positive weights is greater than or equal to any single one of the weighted squares. The following equality sums the first $t$ terms of a geometric series. The second inequality exploits the facts that $\sqrt{1 - \beta_{2_D}^t} < 1$ and $(1 - \beta_{1_D}) \le (1 - \beta_{1_D}^t)$, so $\frac{1 - \beta_{1_D}}{1 - \beta_{1_D}^t} \le 1$. The last inequality is obtained supposing that $\beta_{1_D} \le \sqrt{\beta_{2_D}}$, so that $1 - \left(\frac{\beta_{1_D}}{\sqrt{\beta_{2_D}}}\right)^t < 1$. It should be noted that if the condition $\beta_{1_D} \le \sqrt{\beta_{2_D}}$ is not met, $\left|\alpha_D \times \frac{1}{\sqrt{1 - \beta_{2_D}}} \times \frac{1 - (\beta_{1_D}/\sqrt{\beta_{2_D}})^t}{1 - \beta_{1_D}/\sqrt{\beta_{2_D}}}\right|$ can be used as the bound on the effective stepsize.

According to Theorem 4, as in Theorem 3, the higher $\alpha_D$, the more noise is needed to provide privacy at each training step. Also, when $\beta_{2_D} \to 1$ or $\frac{\beta_{1_D}}{\sqrt{\beta_{2_D}}} \to 1$, more noise is needed. In Section 5, we evaluate the proposed ADAM-DPGAN algorithm with the global sensitivities of Theorem 3 and Theorem 4, which are called ADAM-DPGAN-1 and ADAM-DPGAN-2, respectively.
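The two bounds are easy to compute. The following sketch transcribes them directly from the theorem statements; the example values mirror the paper's MNIST setting, and the function name is ours.

```python
import numpy as np

def adam_dpgan_sensitivity(alpha_d, beta1_d, beta2_d):
    """Per-step global sensitivity of the discriminator parameters.

    Returns the Theorem 3 bound (ADAM-DPGAN-1) and the Theorem 4 bound
    (ADAM-DPGAN-2, valid when beta1_d <= sqrt(beta2_d))."""
    s1 = max(alpha_d, alpha_d * (1.0 - beta1_d) / np.sqrt(1.0 - beta2_d))
    assert beta1_d <= np.sqrt(beta2_d), "Theorem 4 assumes beta1 <= sqrt(beta2)"
    s2 = alpha_d / (np.sqrt(1.0 - beta2_d) * (1.0 - beta1_d / np.sqrt(beta2_d)))
    return s1, s2

# With the paper's MNIST setting (alpha_D = 5e-5, beta1 = 0.5, beta2 = 0.9),
# s1 < s2, which is why ADAM-DPGAN-1 adds less noise per step.
s1, s2 = adam_dpgan_sensitivity(5e-5, 0.5, 0.9)
```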
4.2 Privacy analysis

Given the number of discriminator iterations executed per generator iteration ($I_d$), the overall RDP for each training step of the generator can be calculated using RDP composition (Theorem 1). Because the RDP computation can be transformed to DP based on Theorem 2, and following the post-processing feature of DP, differential privacy for the generator with respect to all records of the training dataset is guaranteed in each training step. As the generator training step iterates $I_g$ times, the overall privacy of the final trained model can be calculated by applying Theorem 1 to the RDP bound of each step and transforming it to DP based on Theorem 2. Therefore, the main remaining challenge is to compute the RDP of each training step of the discriminator.

During each training step of the discriminator, a batch of size $b$ is sampled from the real dataset, and $d$-dimensional Gaussian noise with per-coordinate variance $\sigma_n^2 \Delta_t^2$ is added to the discriminator parameters. If the total number of training records is $N$, the sampling rate $q = b/N$ adds another level of privacy protection, and according to Mironov et al. [17] this mechanism is called the Sampled Gaussian Mechanism (SGM). Mironov et al. show that in the case of SGM, the following theorem is sufficient for assessing the privacy loss.

Theorem 5 (SGM privacy loss [17]) Assume $X$ and $X'$ are two neighboring datasets and $F$ is an SGM applied to a function $f$ with $l_2$-sensitivity of one. If we have:

$$F(X) \sim \mu_0 \triangleq N(0, \sigma^2) \quad (13)$$

$$F(X') \sim \mu_1 \triangleq (1 - q) N(0, \sigma^2) + q N(1, \sigma^2) \quad (14)$$

where $\mu_0$ and $\mu_1$ are probability density functions, then the mechanism $F$ satisfies $(\lambda, \epsilon)$-RDP for:

$$\epsilon \le \frac{1}{\lambda - 1} \log(\max\{A_\lambda, B_\lambda\}) \quad (15)$$

where $A_\lambda \triangleq \mathbb{E}_{x \sim \mu_0}\!\left[\left(\frac{\mu_1(x)}{\mu_0(x)}\right)^{\lambda}\right]$ and $B_\lambda \triangleq \mathbb{E}_{x \sim \mu_1}\!\left[\left(\frac{\mu_0(x)}{\mu_1(x)}\right)^{\lambda}\right]$.

Mironov et al. [17] show that $A_\lambda \ge B_\lambda$, so an upper bound on $A_\lambda$ should be specified to track the privacy loss. For the upper bound on $A_\lambda$, a stable numerical procedure and a closed-form bound are presented by them. The following theorem shows the closed-form bound of $A_\lambda$ for an SGM with sampling rate $q$ and Gaussian noise $N(0, \sigma^2)$.

Theorem 6 (RDP for SGM [17]) If $q \le \frac{1}{5}$, $\sigma \ge 4$, $1 < \lambda \le \frac{1}{2}\sigma^2 L - 2\ln\sigma$, and $\lambda \le \frac{\frac{1}{2}\sigma^2 L^2 - \ln 5 - 2\ln\sigma}{L + \ln(q\lambda) + \frac{1}{2\sigma^2}}$, where $L = \ln\!\left(1 + \frac{1}{q(\lambda - 1)}\right)$, then applying SGM to a function with $l_2$-sensitivity of one satisfies $(\lambda, \epsilon)$-RDP where $\epsilon \le \frac{2 q^2 \lambda}{\sigma^2}$.

According to Theorems 1, 2 and 6, a larger total number of iterations ($I_g \times I_d$) leads to a larger privacy budget. Also, for fixed $\sigma$, a larger sampling rate ($q$) leads to a larger privacy budget (i.e. less privacy).

In practice, we use the numerical implementation of the RDP accountant.¹ In each step, the RDP bound $\epsilon$ is calculated for different orders ($\lambda$ values). For any given $\lambda$, the overall privacy over $t$ steps can be calculated as $\epsilon \times t$. Then, to find the tightest upper bound, the minimum value of $\epsilon$ and the corresponding order $\lambda$ are used to compute the $(\epsilon, \delta)$-DP guarantee.

¹ https://github.com/tensorflow/privacy/blob/master/tensorflow_privacy/privacy/analysis/rdp_accountant.py
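The sketch below shows how such an accountant is typically driven. The import path and function names follow older tensorflow_privacy releases (the module cited in the footnote) and may differ in current versions; the noise multiplier here is an illustrative value, not one from the paper.

```python
from tensorflow_privacy.privacy.analysis.rdp_accountant import (
    compute_rdp, get_privacy_spent)

q = 64 / 60000                      # sampling rate b/N (paper's MNIST setting)
noise_multiplier = 1.1              # sigma_n (illustrative value)
steps = 4 * 100000                  # I_d * I_g discriminator update steps
orders = [1 + x / 10.0 for x in range(1, 100)] + list(range(12, 64))

# Per-step RDP at each order, composed over all steps (Theorem 1), then
# converted to an (epsilon, delta)-DP guarantee at the best order (Theorem 2).
rdp = compute_rdp(q=q, noise_multiplier=noise_multiplier,
                  steps=steps, orders=orders)
epsilon, _, opt_order = get_privacy_spent(orders, rdp, target_delta=1e-5)
```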
5 Experimental results

In this section, the evaluation of ADAM-DPGAN is presented. ADAM-DPGAN is compared to a DP-SGD GAN method, i.e. GANobfuscator [7], and also to GS-WGAN [12] and the method proposed by Han et al. [13], in five aspects: 1) the effect of the privacy level on the visual image quality, 2) the effect of the privacy level on the realism and diversity of the generated samples, evaluated by the Inception Score (IS) [18], Frechet Inception Distance (FID) [19] and Jensen-Shannon score [3], 3) the effect of the privacy level on the stability of training, 4) the effect of the privacy level on the membership inference attack's accuracy, and 5) the runtime. In the following, the experimental setting is described and then the results are discussed.

5.1 Experimental setting

To conduct the experiments, three datasets are used:

• MNIST,² which consists of 70000 labeled handwritten digit images split into 60000 training and 10000 test samples. Each image is a 28×28 grayscale image.
• Fashion-MNIST,³ which comprises 70000 labeled images of 10 fashion categories separated into 60000 training and 10000 test samples. Each image is a 28×28 grayscale image.
• CelebA,⁴ which consists of 200000 celebrity face images. We have selected 60000 random images, which are center-cropped and resized to 48×48.

In the experiments, the network architecture is similar to [43]. The learning rates of the discriminator and generator, i.e. $\alpha_D$ and $\alpha_G$, are set to $5 \times 10^{-5}$. The numbers of iterations on the discriminator ($I_d$) and the generator ($I_g$) are 4 and $1 \times 10^5$, respectively. The gradient penalty coefficient $\lambda$ has the value of 10, and the batch size ($b$) is set to 64. The confidence parameter ($\delta$) is set to $10^{-5}$. For MNIST and Fashion-MNIST, in both the generator and the discriminator, the hyper-parameters of the Adam optimizer, $\beta_1$ and $\beta_2$, are set to 0.5 and 0.9, respectively, and for the CelebA dataset, in both the generator and the discriminator, $\beta_1$ and $\beta_2$ are set to 0.0 and 0.9, respectively. As described before, in all of the experiments, "ADAM-DPGAN-1" and "ADAM-DPGAN-2" refer to ADAM-DPGAN with the global sensitivities of Theorem 3 and Theorem 4, respectively. For GANobfuscator [7], the training dataset is split into publicly available data and private data with a ratio of 2 to 98, respectively; the public data is used for adaptive clipping. In GS-WGAN [12], centralized training with one discriminator is used. In Han's method [13], their proposed adaptive method is used to calculate the clipping bound of the discriminator loss. All experiments are conducted in TensorFlow.

² http://yann.lecun.com/exdb/mnist/
³ https://github.com/zalandoresearch/fashion-mnist/
⁴ http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html

5.2 The effect of privacy level on the visual image quality

The goal of the first experiment is to assess the effect of the privacy level on the synthetic image quality for the different datasets. Figures 2 and 3 show random synthetic images corresponding to the different models with various privacy budgets; $\epsilon = \infty$ refers to the model without any privacy mechanism. As these figures show, for both ADAM-DPGAN-1 and ADAM-DPGAN-2 the image quality decreases when the privacy budget decreases. This is a direct result of the fact that the variance of the noise increases as the privacy budget decreases, and the noise therefore has a greater impact on the parameter values of the discriminator. Comparing ADAM-DPGAN-1 and ADAM-DPGAN-2, we infer that the quality of the generated images in the first method is better than in the second; this is due to the fact that the sensitivity introduced in Theorem 3 is smaller than that of Theorem 4, so the noise variance applied to the parameters is lower.

Figure 4 compares the quality of the sample synthetic images generated by the different methods. As this figure shows, the quality of the images generated by ADAM-DPGAN is better than those generated by GANobfuscator [7], GS-WGAN [12] and Han's method [13]. This shows that our proposed method can generate higher-quality images without access to public data. This is due to the accurate determination of the effect of each data sample on the model parameters at each step of the learning procedure and the addition of appropriate noise to the parameters. Therefore, the final outputs' quality is not significantly affected while guaranteeing differential privacy of the data samples.

5.3 The effect of privacy level on the diversity and realism of the generated samples

To measure the diversity and quality of the generated samples, the IS metric is used. IS can evaluate GAN performance with respect to labeled data. Formally, the IS of the generator $G$ is defined as [18]:

$$IS = \exp\!\left(\mathbb{E}_{u \sim G(z)} KL(\Pr(w \mid u) \,\|\, \Pr(w))\right) \quad (16)$$

where $u$ is a sample generated by $G$, $\Pr(w \mid u)$ is the conditional distribution of label $w$ given the generated sample $u$, $\Pr(w)$ is the marginal distribution of $w$ (i.e., $\Pr(w) = \int_u \Pr(w \mid u)\, G(z)\, dz$) and $KL$ denotes the KL-divergence. IS always has a value between 1 and the number of classes supported by the pre-trained classifier, where a higher value is better.
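As an illustration, (16) can be estimated from classifier outputs as in the following sketch; the input array is an assumption (softmax outputs of a pre-trained classifier on generated samples), not the paper's evaluation code.

```python
import numpy as np

def inception_score(p_w_given_u, eps=1e-12):
    """IS per (16) from an (n_samples, n_classes) array of class posteriors."""
    p_w = p_w_given_u.mean(axis=0)                      # marginal Pr(w)
    kl = np.sum(p_w_given_u * (np.log(p_w_given_u + eps) - np.log(p_w + eps)),
                axis=1)                                 # per-sample KL term
    return float(np.exp(kl.mean()))
```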
Figure 5 shows the IS in the different scenarios with various privacy budgets; $\epsilon = \infty$ refers to the model without any privacy mechanism. As the figure shows, in all scenarios the IS decreases as the privacy budget is reduced. However, the IS of the proposed method is fairly close to the IS of the improved WGAN [16] without any privacy constraint, and it is considerably higher than GANobfuscator [7], GS-WGAN [12] and the method proposed by Han et al. [13]. Comparing ADAM-DPGAN-1 and ADAM-DPGAN-2, as the figure shows, the IS of ADAM-DPGAN-1 is slightly better than that of ADAM-DPGAN-2; the reason is the smaller global sensitivity in ADAM-DPGAN-1 compared to ADAM-DPGAN-2.
Fig. 2 Synthetic generated images with ADAM-DPGAN-1 for different datasets. (a) Synthetic samples versus different privacy budgets on MNIST dataset. (b) Synthetic samples versus different privacy budgets on Fashion-MNIST dataset. (c) Synthetic samples versus different privacy budgets on CelebA dataset

Fig. 3 Synthetic generated images with ADAM-DPGAN-2 for different datasets. (a) Synthetic samples versus different privacy budgets on MNIST dataset. (b) Synthetic samples versus different privacy budgets on Fashion-MNIST dataset. (c) Synthetic samples versus different privacy budgets on CelebA dataset

Fig. 4 Synthetic generated images of different methods (GANobfuscator [7], GS-WGAN [12] and the method proposed by Han et al. [13]) with $\epsilon = 1$, $\delta = 10^{-5}$ on different datasets. (a) Synthetic samples on MNIST dataset. (b) Synthetic samples on Fashion-MNIST dataset. (c) Synthetic samples on CelebA dataset

Fig. 5 Inception scores of synthetic samples for $\delta = 10^{-5}$ and different privacy budgets

Fig. 6 Frechet Inception Distance of generated samples for $\delta = 10^{-5}$ and different privacy budgets
The FID score is another metric for evaluating GAN performance, which captures the similarity of the generated images to the real ones. Formally, FID is defined as [19]:

$$FID = \|\mu_t - \mu_s\|^2 + tr\!\left(\Sigma_t + \Sigma_s - 2(\Sigma_t \Sigma_s)^{\frac{1}{2}}\right) \quad (17)$$

where $x_t \sim N(\mu_t, \Sigma_t)$ and $x_s \sim N(\mu_s, \Sigma_s)$ are the feature vectors of a specific layer of an Inception network for the real and generated images, respectively, and $tr$ denotes the trace of a matrix. A lower FID value indicates more similarity between the real and the synthetic images, which corresponds to a higher quality of the generated images.
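A direct transcription of (17) into Python is sketched below; the feature arrays are assumed to be Inception activations for real and generated images, and the matrix square root follows the standard SciPy approach.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_inception_distance(feat_real, feat_fake):
    """FID per (17) on (n_samples, n_features) Inception feature arrays."""
    mu_r, mu_f = feat_real.mean(axis=0), feat_fake.mean(axis=0)
    cov_r = np.cov(feat_real, rowvar=False)
    cov_f = np.cov(feat_fake, rowvar=False)
    covmean = sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):          # drop tiny imaginary residue
        covmean = covmean.real
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))
```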
Figure 6 demonstrates the FID score of the different methods on the labeled data (i.e. MNIST and Fashion-MNIST); $\epsilon = \infty$ refers to the model without any privacy mechanism. As this figure shows, the FID score increases when the privacy budget decreases. The FID score of the proposed method is slightly higher than that of the improved WGAN [16] without any privacy mechanism, and it is considerably lower than GANobfuscator [7], GS-WGAN [12] and Han's method [13]. ADAM-DPGAN-1 is slightly better than ADAM-DPGAN-2, because the global sensitivity in ADAM-DPGAN-1 is lower than in ADAM-DPGAN-2.

To quantitatively assess the proposed method on the unlabeled dataset (i.e. CelebA), first another discriminator, $D'$, is trained to classify real and synthetic samples. Using the output of this discriminator, the Jensen-Shannon divergence between the conditional probability of the discriminator's output and a Bernoulli distribution with parameter $p = 0.5$ is measured. Formally, the Jensen-Shannon divergence of these distributions is defined as [3]:

$$S(G) = \frac{1}{2} KL(p(w \mid u) \,\|\, B_p) + \frac{1}{2} KL(B_p \,\|\, p(w \mid u)) \quad (18)$$

where $B_p$ is the Bernoulli distribution with parameter $p = 0.5$ and $p(w \mid u)$ is the conditional distribution of the discriminator's output predicting $u$'s label as $w$ (real/synthetic sample label). The more the synthetic samples resemble the real samples, the shorter the distance between the conditional distribution and the Bernoulli distribution. Therefore, a lower value of $S(G)$ indicates a better generator.
Fig. 7 Jensen-Shannon score of synthetic samples on CelebA for $\delta = 10^{-5}$ and different privacy budgets

Fig. 8 The discriminator and generator losses during training in different methods on MNIST for ($\epsilon = 8$, $\delta = 10^{-5}$) and ($\epsilon = 2$, $\delta = 10^{-5}$)

Fig. 9 The discriminator and generator losses during training in different methods on Fashion-MNIST for ($\epsilon = 8$, $\delta = 10^{-5}$) and ($\epsilon = 2$, $\delta = 10^{-5}$)

Fig. 10 The discriminator and generator losses during training in different methods on CelebA for ($\epsilon = 8$, $\delta = 10^{-5}$) and ($\epsilon = 2$, $\delta = 10^{-5}$)

Fig. 11 LOGAN [4] attack performance against ADAM-DPGAN for different privacy budgets and $\delta = 10^{-5}$ on different datasets

Figure 7 shows the Jensen-Shannon score on the CelebA dataset. As the figure shows, the Jensen-Shannon score decreases as the privacy budget increases. ADAM-DPGAN-1 generates more realistic samples than the other methods, and ADAM-DPGAN-2 is better than GS-WGAN [12] and Han's method [13].

5.4 The effect of privacy level on training stability

To evaluate the effect of privacy loss on the stability and convergence of GAN models, the generator and discriminator losses are tracked during the training steps and are reported every 50 iterations of the generator ($T_g$). Figures 8, 9 and 10 compare the discriminator and generator losses during training on the MNIST, Fashion-MNIST, and CelebA datasets for all methods when the privacy budget is set to 8 and 2. As seen, in ADAM-DPGAN-1, ADAM-DPGAN-2 and GS-WGAN [12], the variance of the discriminator and generator losses remains modest, and only the small fluctuations that result from min-max training are observed. In GANobfuscator [7] and Han's method [13], the oscillations in the losses are more considerable.

5.5 The effect of privacy level on the membership inference attack's accuracy

Although the privacy of the proposed method has been proved in Section 4.2, it is appropriate to examine the resistance of the proposed solution against concrete attacks. In this section, the resistance of the proposed method against membership inference attacks is examined. According to [6], the LOGAN discriminator-accessible attack [4] outperforms the others [5, 6], so we evaluate our methods against this attack. Since the size of the training
dataset is a main factor in the success of the attack, and the attack cannot do better than a random guess for a training data size of 60000, we use a training data size of 4096 to train our victim models. To evaluate the attack's performance, the Area Under the Receiver Operating Characteristic Curve (AUCROC) is used. Figure 11 shows the success of the attack against the proposed method for different privacy budgets on the MNIST, Fashion-MNIST, and CelebA datasets. As this figure shows, for the MNIST and Fashion-MNIST datasets, although the accuracy of the attack is high for the non-private model ($\epsilon = \infty$: AUCROC = 0.88 and AUCROC = 0.68 for MNIST and Fashion-MNIST, respectively), by applying the ADAM-DPGAN methods the attacker cannot do better than random guessing. For the CelebA dataset, the AUCROC decreases when the privacy budget decreases. In ADAM-DPGAN-2, when the privacy budget is 2, the success of the attack is similar to that of a random guess, but this is not the case for ADAM-DPGAN-1, which confirms the fact that in non-convex stochastic optimization the upper bound of Theorem 3 may not be met.

Finally, Fig. 12 compares the resistance of ADAM-DPGAN and GANobfuscator [7] against the LOGAN attack [4] on the MNIST and Fashion-MNIST datasets. As the figure shows, the attack's accuracy is high for the non-private model, while the other models perform almost the same. Although the quality of the images synthesized by the ADAM-DPGAN method is better than GANobfuscator [7] according to the comparisons made in the previous subsections, the accuracy of the LOGAN attack against them is almost the same.

Fig. 12 LOGAN [4] attack performance against different methods for various privacy budgets and $\delta = 10^{-5}$ on MNIST and Fashion-MNIST datasets
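A LOGAN-style discriminator-accessible attack and its AUCROC evaluation can be sketched as follows; the scoring callable is an assumption standing in for the victim's trained discriminator, and this is not the evaluation code of the paper.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def logan_discriminator_attack(discriminator_score, members, non_members):
    """LOGAN-style membership inference: records the trained discriminator
    scores as 'more real' are guessed to be training members."""
    scores = np.concatenate([discriminator_score(members),
                             discriminator_score(non_members)])
    labels = np.concatenate([np.ones(len(members)),
                             np.zeros(len(non_members))])
    return roc_auc_score(labels, scores)   # AUCROC; 0.5 is a random guess
```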
5.6 Runtime assessment

Figure 13 compares the average runtime over 10 runs of the different methods on the various datasets. The experiments were implemented with the settings mentioned in Section 5.1 using TensorFlow on a server with an Intel Core i9 CPU, one GeForce RTX 2080 GPU, and 64 GB of RAM. As the figure shows, the runtime of the proposed method is longer than that of GS-WGAN [12] and the method proposed by Han et al. [13]; this is because the proposed method provides privacy for both the generator and discriminator networks (as opposed to only the generator network in those methods). However, the execution time of the proposed method is much less than that of GANobfuscator [7], because computing per-example gradients is a time-consuming operation in GANobfuscator [7]. It is noteworthy that, as the efficient techniques for computing per-example gradients in convolutional neural networks [32] have a complicated implementation, we have used a naive approach (i.e. changing the batch size to 1) to compute per-example gradients in the GANobfuscator [7] method. For color images of the CelebA dataset, because the neural network is deeper, the runtime of GANobfuscator [7] is dramatically higher than that of the other methods, so it is not comparable to the others.

Fig. 13 The runtime of different methods on various datasets. The runtime of GANobfuscator [7] on CelebA is considerably higher than the others, so it is not shown in the chart

6 Conclusion

In this paper, ADAM-DPGAN, a differentially private GAN model which generates high-quality synthetic samples, has been proposed. ADAM-DPGAN specifies the maximum effect of each sensitive training record on the model parameters at each step of the learning procedure when the Adam optimizer is used. It then adds appropriate noise to the parameters during the learning procedure to preserve privacy whilst maintaining quality. The ADAM-DPGAN privacy analysis proves that it can guarantee $(\epsilon, \delta)$-differential privacy. In addition, experimental evaluations on various image datasets demonstrate the higher performance of the proposed method compared to previous work in terms of visual quality, realism and diversity of generated samples, convergence of training, and resistance to membership inference attacks.

There are several research directions for the future. First, the global sensitivity introduced in this paper is a function of the training hyper-parameters, so future work should consider an in-depth analysis of the impact of these parameters to create a better balance between privacy, accuracy and efficiency of the model. Second, evaluation of ADAM-DPGAN on different variants of GAN models and on non-image datasets will be considered as future work.

Data availability statement Publicly available datasets were analyzed in this study. This data can be found at: MNIST dataset: http://yann.lecun.com/exdb/mnist/; Fashion-MNIST dataset: https://github.com/zalandoresearch/fashion-mnist/; CelebA dataset: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html.
References

1. Machanavajjhala A, Kifer D, Gehrke J, Venkitasubramaniam M (2007) l-diversity: privacy beyond k-anonymity. ACM Trans Knowl Discov Data (TKDD) 1(1):1–24
2. Li N, Li T, Venkatasubramanian S (2007) t-closeness: privacy beyond k-anonymity and l-diversity. In: 23rd International conference on data engineering, pp 106–115
3. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: 27th International conference on neural information processing systems, pp 2672–2680
4. Hayes J, Melis L, Danezis G, De Cristofaro E (2019) LOGAN: membership inference attacks against generative models. In: Privacy enhancing technologies symposium, pp 133–152
5. Hilprecht B, Harterich M, Bernau D (2019) Monte Carlo and reconstruction membership inference attacks against generative models. In: Privacy enhancing technologies symposium, pp 232–249
6. Chen D, Yu N, Zhang Y, Fritz M (2020) GAN-leaks: a taxonomy of membership inference attacks against generative models. In: 2020 ACM SIGSAC conference on computer and communications security, pp 343–362
7. Xu C, Ren J, Zhang D, Zhang Y, Qin Z, Ren K (2019) GANobfuscator: mitigating information leakage under GAN via differential privacy. IEEE Trans Inform Forens Secur 14(9):2358–2371
8. Torkzadehmahani R, Kairouz P, Paten B (2019) DP-CGAN: differentially private synthetic data and label generation. In: IEEE/CVF Conference on computer vision and pattern recognition workshops (CVPRW), pp 1–8
9. Jordon J, Yoon J, Schaar M (2019) PATE-GAN: generating synthetic data with differential privacy guarantees. In: Seventh international conference on learning representations, pp 1–21
10. Long Y, Lin S, Yang Z, Gunter CA, Li B (2019) Scalable differentially private generative student model via PATE. arXiv:1906.09338
11. Wang B, Wu F, Long Y, Rimanic L, Zhang C, Li B (2021) DataLens: scalable privacy preserving training via gradient compression and aggregation. arXiv:2103.11109
12. Chen D, Orekondy T, Fritz M (2020) GS-WGAN: a gradient-sanitized approach for learning differentially private generators. In: 34th Conference on neural information processing systems, pp 1–18
13. Han C, Xue R (2021) Differentially private GANs by adding noise to discriminator's loss. Comput Secur 107:1–14
14. Mukherjee S, Xu Y, Trivedi A, Ferres J (2019) privGAN: protecting GANs from membership inference attacks at low cost. arXiv:2001.00071
15. Kingma DP, Ba JL (2015) Adam: a method for stochastic optimization. In: International conference on learning representations, pp 1–15
16. Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville AC (2017) Improved training of Wasserstein GANs. In: Annual conference on neural information processing systems (NIPS), pp 5767–5777
17. Mironov I, Talwar K, Zhang L (2019) Rényi differential privacy of the sampled Gaussian mechanism. arXiv:1908.10530
18. Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X (2016) Improved techniques for training GANs. In: Advances in neural information processing systems, pp 2234–2242
19. Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S (2017) GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Annual conference on neural information processing systems (NIPS), pp 6626–6637
20. Fredrikson M, Jha S, Ristenpart T (2015) Model inversion attacks that exploit confidence information and basic countermeasures. In: 22nd ACM SIGSAC Conference on computer and communications security, pp 1322–1333
21. Yang Z, Zhang J, Chang E-C, Liang Z (2019) Neural network inversion in adversarial setting via background knowledge alignment. In: 2019 ACM SIGSAC conference on computer and communications security, pp 225–240
22. Shokri R, Stronati M, Song C, Shmatikov V (2017) Membership inference attacks against machine learning models. In: IEEE symposium on security and privacy (SP), pp 3–18
23. Yeom S, Giacomelli I, Fredrikson M, Jha S (2018) Privacy risk in machine learning: analyzing the connection to overfitting. In: 2018 IEEE 31st Computer security foundations symposium, pp 268–282
24. Sablayrolles A, Douze M, Schmid C, Ollivier Y, Jegou H (2019) White-box vs black-box: Bayes optimal strategies for membership inference. In: 36th International conference on machine learning, pp 1–11
25. Nasr M, Shokri R, Houmansadr A (2019) Comprehensive privacy analysis of deep learning: stand-alone and federated learning under passive and active white-box inference attacks. In: IEEE Symposium on security and privacy, pp 739–853
26. Abadi M, Chu A, Goodfellow I, McMahan HB, Mironov I, Talwar K, Zhang L (2016) Deep learning with differential privacy. In: 2016 ACM SIGSAC Conference on computer and communications security, pp 308–318
27. Papernot N, Abadi M, Erlingsson U, Goodfellow I, Talwar K (2017) Semi-supervised knowledge transfer for deep learning from private training data. In: International conference on learning representations (ICLR), pp 1–16
28. Papernot N, Song S, Mironov I, Raghunathan A, Talwar K, Erlingsson U (2018) Scalable private learning with PATE. In: International conference on learning representations (ICLR), pp 1–34
29. Phan N, Wang Y, Wu X, Dou D (2016) Differential privacy preservation for deep auto-encoders: an application of human behavior prediction. In: Thirtieth AAAI conference on artificial intelligence (AAAI-16), pp 1309–1316
30. Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv:1411.1784
31. Goodfellow I (2015) Efficient per-example gradient computations. arXiv:1510.01799
32. Rochette G, Manoel A, Tramel EA (2020) Efficient per-example gradient computations in convolutional neural networks. arXiv:1912.06015
33. Bu Z, Wang H, Long Q, Su WJ (2021) On the convergence of deep learning with differential privacy. arXiv:2106.07830
34. Nasr M, Shokri R, Houmansadr A (2018) Machine learning with membership privacy using adversarial regularization. In: ACM SIGSAC conference on computer and communications security, pp 634–646
35. Jia J, Salem A, Backes M, Zhang Y, Gong NZ (2019) MemGuard: defending against black-box membership inference attacks via adversarial examples. In: ACM SIGSAC Conference on computer and communications security, pp 259–274
36. Yang Z, Shao B, Yuan B, Chang E-C, Zhang F (2020) Defending model inversion and membership inference attacks via prediction purification. arXiv:2005.03915
37. Dwork C, Kenthapadi K, McSherry F, Mironov I, Naor M (2006) Our data, ourselves: privacy via distributed noise generation. In: Advances in cryptology (EUROCRYPT 2006)
38. Dwork C, Roth A (2013) The algorithmic foundations of differential privacy. Found Trends Theor Comput Sci 9(3–4):211–407
39. Dwork C, Rothblum GN, Vadhan S (2010) Boosting and differential privacy. In: Proceedings of IEEE 51st annual symposium on foundations of computer science, pp 51–60
40. Mironov I (2017) Rényi differential privacy. In: IEEE 30th computer security foundations symposium (CSF), pp 263–275
41. Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. In: International conference on machine learning, pp 214–223
42. Chen X, Liu S, Sun R, Hong M (2019) On the convergence of a class of Adam-type algorithms for non-convex optimization. In: 7th International conference on learning representations (ICLR 2019), pp 1–43
43. Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv:1511.06434

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Maryam Azadmanesh is currently a Ph.D. candidate in Computer Engineering at the University of Isfahan. She received her M.Sc. in Computer Engineering from Sharif University of Technology in 2012. Her research interests include data privacy, privacy-preserving machine learning, cryptography, and network security.

Behrouz Shahgholi Ghahfarokhi received his B.Sc. in Computer Engineering (2004), his M.S. in Artificial Intelligence (2006), and his Ph.D. in Computer Architecture (2011) from the University of Isfahan. He joined the University of Isfahan in 2011 and is now an associate professor at the Faculty of Computer Engineering. His research interests include mobile communications, network security, and artificial intelligence.

Maede Ashouri Talouki is an assistant professor in the IT Engineering department of the University of Isfahan (Iran). She received her B.S. and M.S. degrees in Computer Engineering from the University of Isfahan (Iran) in 2004 and 2007, respectively. In 2012, she received her Ph.D. degree in computer engineering from the University of Isfahan, and in 2013 she joined the University of Isfahan. Her research interests include mobile network security, user privacy and anonymity, cryptographic protocols, distributed cryptography protocols and network security.
