Week 3 - Post - GAN


Official Open

Visual Generative
AI Application
Generative Adversarial Networks (GAN)
Week 3
AY 24/25
SPECIALIST DIPLOMA IN APPLIED GENERATIVE AI (SDGAI)

Objectives
• By the end of this module, learners will be able to:
• Describe the concepts of Generative Adversarial Networks (GAN)
• Train a GAN model to generate output

What are Generative Adversarial Networks?


• A framework where a generative model is pitted against an adversary:
• A discriminative model learns to determine whether a sample is generated ("fake") or not generated ("real")
• A generative model generates (fake) samples
• Competition between the discriminator and the generator drives both models to improve until the "fakes" are indistinguishable from the "reals"

Game Theory and GAN


Generator G(z). Aim: generate samples that will not be classified as fake.
Discriminator D(x). Aim: be able to tell that the generated samples are fake.

The solution to this game is the Nash equilibrium: G(z) is drawn from the same
distribution as the training data, and D(x) = 0.5.

Applications: StyleGAN
• Human face generation (StyleGAN)
• Go to StyleGAN or StyleGAN3
• Generate a few images of a human face.

Are you able to tell that the image is "fake"?

Applications: CycleGAN

Deep fake: from horses to zebras

Turning a horse video into a zebra video (by CycleGAN) (youtube.com)



Applications: GauGAN

Translating Scribbles
into Images

How to use NVIDIA GauGAN web demo app. (youtube.com)



Generative Adversarial Networks (GAN)


• A GAN consists of:
• A discriminative model D(x): examines samples to determine whether they are real or fake
• A generative model G(z): creates samples intended to come from the same distribution as the training data

[Diagram: Samples → Discriminative Model D(x) → Real or Fake; Noise → Generative Model G(z) → Generated Samples]

Generative vs Discriminative Models

[Diagram: Input noise z → Generator function G → generator sample G(z); input sample x from the training dataset → Discriminator function D → decision D(x)]

• The Generator strives to make D(G(z)) approach 1, i.e. to maximize the probability that its samples are classified as real.
• The Discriminator strives to make D(G(z)) approach 0 while keeping D(x) close to 1 for real samples.

The solution to this game is the Nash equilibrium: G(z) is drawn from the same
distribution as the training data, and D(x) = 0.5.

Discriminator Function
• A discriminator is a classifier model.
• It can be implemented using a feedforward neural network.
• It classifies whether a given sample is real or fake.

[Diagram: Samples x → Discriminator D(x) → Prob(y | x), the probability of the class (real or fake) given the sample]
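A minimal sketch of such a classifier, assuming NumPy; the single dense layer and random placeholder weights are purely illustrative, not a trained model from the practical:

```python
import numpy as np

def discriminator(x, W, b):
    """Map a batch of flattened samples to a probability in (0, 1)
    via a single dense layer followed by a sigmoid."""
    logits = x @ W + b                    # dense layer
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid squashes logits to (0, 1)

rng = np.random.default_rng(0)
W = rng.normal(size=(784, 1)) * 0.01      # placeholder weights for 28x28 inputs
b = np.zeros(1)
x = rng.normal(size=(4, 784))             # a minibatch of 4 flattened "images"
p_real = discriminator(x, W, b)           # one probability per sample
print(p_real.shape)                       # (4, 1)
```

A real discriminator would stack several such layers and be trained with binary cross entropy, but the input/output contract is the same: a sample in, a probability out.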

Generator Function
• Given a simple prior distribution over z (e.g. a Gaussian), the generator produces a sample x̂ drawn ideally from the distribution of the training data.
• The generator can be either a feedforward neural network or a convolutional neural network.

[Diagram: Input z, sampled from the prior distribution → Generator G(z) → generated samples X̂]
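A minimal sketch of the sampling-then-transforming step, assuming NumPy; the single dense layer and placeholder weights are illustrative only:

```python
import numpy as np

def generator(z, W, b):
    """Map noise vectors z to flattened samples with one dense layer + tanh.
    tanh keeps the generated values in [-1, 1]."""
    return np.tanh(z @ W + b)

rng = np.random.default_rng(0)
z = rng.standard_normal((4, 100))       # z sampled from a Gaussian prior
W = rng.normal(size=(100, 784)) * 0.01  # placeholder weights: 100-dim noise -> 784 pixels
b = np.zeros(784)
x_hat = generator(z, W, b)              # generated samples X^
print(x_hat.shape)                      # (4, 784); reshape to (4, 28, 28) for images
```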

Understanding Cost Function for GAN


• A binary cross entropy (BCE) function is used to capture the loss
of both the generator and the discriminator.
• The BCE cost function has two parts:
• When the label and the prediction are similar, BCE ~ 0
• When the label and the prediction are different, BCE → ∞

min_d max_g  −( E_x[ log d(x) ]  +  E_z[ log(1 − d(g(z))) ] )

• The first expectation relates to the discriminator on actual samples; the second relates to generated samples.
• The discriminator minimizes this cost while the generator maximizes it, which captures the adversarial implementation (a zero-sum game).
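The two behaviours of BCE described above can be checked numerically. A minimal sketch, assuming NumPy:

```python
import numpy as np

def bce(y, p, eps=1e-12):
    """Binary cross entropy between labels y and predictions p."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# When the label and the prediction are similar, BCE ~ 0
print(round(bce(np.array([1.0]), np.array([0.99])), 3))  # ~0.01

# When the label and the prediction are different, BCE grows toward infinity
print(round(bce(np.array([1.0]), np.array([0.01])), 3))  # ~4.605
```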

Training a GAN Model


• GANs are trained in an alternating (discriminator, then generator) fashion.
• The two models should always have a similar "skill" level, progressing and improving at approximately the same rate.
• Use an SGD-like algorithm of choice (e.g. Adam) on two minibatches simultaneously:
• A minibatch of training examples (reals, X)
• A minibatch of generated samples (fakes, X̂)

[Diagram: Noise → Generator → fakes X̂; reals X and fakes X̂ → Discriminator → output Ŷ. Discriminator aim: distinguish the fakes from the reals. Generator aim: make the fakes look real.]
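To make the alternating schedule concrete, here is a toy one-parameter GAN. This is an illustrative sketch assuming NumPy, using finite-difference gradients instead of backprop and a one-dimensional "dataset"; it is not the practical's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def D(x, td):                 # toy discriminator: sigmoid around a learned threshold td
    return 1.0 / (1.0 + np.exp(-(x - td)))

def G(z, tg):                 # toy generator: shift Gaussian noise by a learned offset tg
    return z + tg

def d_loss(td, tg, real, z):  # BCE cost: reals labelled 1, fakes labelled 0
    return (-np.mean(np.log(D(real, td) + 1e-9))
            - np.mean(np.log(1.0 - D(G(z, tg), td) + 1e-9)))

def g_loss(td, tg, z):        # generator cost: fakes labelled "real"
    return -np.mean(np.log(D(G(z, tg), td) + 1e-9))

td, tg, lr, eps = 0.0, 0.0, 0.1, 1e-4
for step in range(200):
    real = rng.normal(3.0, 1.0, 64)   # minibatch of training examples
    z = rng.standard_normal(64)       # minibatch of noise for the fakes
    # 1) Discriminator step (finite-difference gradient, for brevity only)
    gd = (d_loss(td + eps, tg, real, z) - d_loss(td - eps, tg, real, z)) / (2 * eps)
    td -= lr * gd
    # 2) Generator step on a fresh minibatch of noise
    z = rng.standard_normal(64)
    gg = (g_loss(td, tg + eps, z) - g_loss(td, tg - eps, z)) / (2 * eps)
    tg -= lr * gg

# After training, tg has shifted the fakes toward the real data (mean 3.0)
```

The point is the structure of the loop: one discriminator update on a mixed real/fake minibatch, then one generator update, repeated so neither model runs too far ahead of the other.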

Training a Discriminator
• Both real and fake examples are used in training.
• Reals X and fakes X̂ (generated from noise) are passed to the discriminator, which produces an output Ŷ.
• The cost is binary cross entropy, with labels from both the real and the fake samples.
• Only the discriminator's parameters are updated in this step.
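A small numerical sketch of this cost, assuming NumPy; the discriminator scores below are made-up illustrative values:

```python
import numpy as np

def bce(y, p, eps=1e-9):
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Suppose the discriminator scores a minibatch of reals and fakes:
d_real = np.array([0.9, 0.8, 0.7])   # D(x) on real samples
d_fake = np.array([0.2, 0.1, 0.3])   # D(G(z)) on fake samples
# Reals are labelled 1, fakes 0; only the discriminator is updated on this cost.
scores = np.concatenate([d_real, d_fake])
labels = np.concatenate([np.ones(3), np.zeros(3)])
print(round(bce(labels, scores), 3))  # ~0.228: the scores mostly match the labels
```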

Training a Generator
• Only fake examples are used in training.
• Noise is passed through the generator to produce fakes X̂; the discriminator scores them as Ŷ.
• The cost is binary cross entropy, with all labels set to "real".
• Only the generator's parameters are updated in this step.
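The "all labels set to real" trick can be sketched numerically, assuming NumPy and made-up discriminator scores:

```python
import numpy as np

def bce(y, p, eps=1e-9):
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Only fakes are scored; every label is set to "real" (1), so the cost
# is small only when the discriminator is fooled (D(G(z)) close to 1).
d_fake = np.array([0.2, 0.1, 0.3])       # discriminator scores on fakes
labels = np.ones_like(d_fake)            # all labels = real
loss_unfooled = bce(labels, d_fake)      # large: fakes are being caught
loss_fooled = bce(np.ones(1), np.array([0.95]))  # small: fake passes as real
print(loss_unfooled > loss_fooled)       # True
```

Minimizing this cost pushes the generator's parameters toward samples the discriminator scores as real.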

Issues with Training GAN Models


• GANs should be trained for multi-mode output.
• Modes are peaks in the distribution of features, and are typical of real-world datasets.
• Mode collapse happens when the generator gets stuck in one of the modes.

Issues with Training GAN Models


• A GAN attempts to make the generated distribution and the real distribution look similar.
• If the discriminator becomes too good at distinguishing between real and fake data early in training, it provides very small gradients to the generator. The generator then receives insufficient information to improve, leading to slow or stalled learning.
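This vanishing-gradient effect can be seen numerically; a sketch assuming NumPy. For the minimax loss log(1 − D(G(z))) with D = sigmoid(a), the gradient with respect to the discriminator's logit a is −sigmoid(a), which collapses when the discriminator confidently rejects fakes (a very negative):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def grad_wrt_logit(a):
    # d/da log(1 - sigmoid(a)) = -sigmoid(a)
    return -sigmoid(a)

print(grad_wrt_logit(-1.0))   # discriminator moderately sure it's fake: usable gradient
print(grad_wrt_logit(-10.0))  # discriminator very sure it's fake: gradient nearly vanishes
```

This is one motivation for the "all labels = real" generator cost used in training, whose gradient does not vanish in the same way when the discriminator dominates.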

Practical 3
Generative Adversarial Networks
• Vanilla GAN
• Deep Convolutional GAN

Handwritten Digit Generation
• In this practical we will learn to generate 28 × 28 grayscale images of handwritten digits (0 to 9).
• We will train a GAN model using the MNIST dataset.

Vanilla GAN
• Generator model: takes a noise vector z as input and applies repeating blocks that transform and scale up the vector to the required dimensions.
• Each block consists of a dense layer, a Leaky ReLU activation, and a batch-normalization layer.
• The output of the final block is simply reshaped into the required output image size.
• The discriminator is a simple feedforward network.
• It takes an image as input (a real image or a fake output from the generator) and classifies it as real or fake.
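A minimal forward-pass sketch of one such generator block and the final reshape, assuming NumPy; a real implementation would use a framework such as Keras or PyTorch, with learned batch-norm scale/shift parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the minibatch (no learned scale/shift here)
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def block(x, W, b):
    # One generator block: dense -> Leaky ReLU -> batch norm
    return batch_norm(leaky_relu(x @ W + b))

z = rng.standard_normal((16, 100))                  # minibatch of noise vectors
W1, b1 = rng.normal(size=(100, 256)) * 0.02, np.zeros(256)
W2, b2 = rng.normal(size=(256, 784)) * 0.02, np.zeros(784)
h = block(z, W1, b1)                                # scale up 100 -> 256
images = np.tanh(h @ W2 + b2).reshape(16, 28, 28)   # final dense, reshape to 28x28
print(images.shape)                                 # (16, 28, 28)
```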

Deep Convolutional GAN

• Generator with a convolutional neural network architecture.
• Takes a noise vector as input and passes it through a repeating setup of up-sampling layers, convolutional layers, and batch-normalization layers to stabilize training.
• Modifications:
• Use batch-norm in both the generator and the discriminator.
• Use ReLU activation in the generator for all layers except the output, which uses Tanh.
• Use LeakyReLU activation in the discriminator for all layers.

Alec Radford et al., Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
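The up-sampling step above can be illustrated with simple nearest-neighbour up-sampling, a NumPy sketch only; DCGAN implementations typically use transposed convolutions or a framework's up-sampling layers:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour up-sampling: double the height and width
    of a feature map by repeating each value."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

fmap = np.arange(4.0).reshape(2, 2)   # a tiny 2x2 feature map
up = upsample2x(fmap)
print(up.shape)                       # (4, 4): each value now covers a 2x2 patch
```

Stacking such up-sampling with convolutions lets the generator grow a low-resolution feature map out of the noise vector into a full-sized image.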
