Week 2 - VAE - Lesson


Official Open

Visual Generative
AI Application
Variational Autoencoder (VAE)
Week 2
AY 24/25
SPECIALIST DIPLOMA IN APPLIED GENERATIVE AI (SDGAI)

Objectives
• By the end of this module, learners will be able to
• Describe the concepts of the variational autoencoder
• Train a VAE model to generate output

Data Distribution and Images


What makes a picture of a cat look like a cat?

• Each “datapoint” (image) has thousands or millions of dimensions (pixels).
• There are dependencies between pixels, e.g., nearby pixels have similar colors and are organized into objects.
• A generative model needs to “capture” these dependencies.

[Figure: example images drawn from the data distribution]

Generative Model Formulation

Given a data distribution, we want to generate new samples that are “similar” to it.

[Figure: some initial conditions/distribution → Generative Model → generated data distribution]

Taxonomy of Generative Models


Choices:
• Explicit density estimation: explicitly define and solve for the generated data distribution
• Implicit density estimation: learn a model that can sample from the generated data distribution without explicitly defining it

[Figure: taxonomy tree]
• Generative Models
  • Explicit Density
    • Tractable Density: Fully Visible Belief Nets (PixelRNNs)
    • Approximate Density: Variational Autoencoders, Restricted Boltzmann Machines
  • Implicit Density
    • Generative Adversarial Networks
    • Markov Chain

Autoencoder

• The autoencoder’s main components are:
  • an encoder,
  • a latent feature representation, and
  • a decoder.
• We want the autoencoder to reconstruct the input well; at the same time, it should create a latent representation that is useful and meaningful.

Latent Representation (Intuition)
• Learning how to write numbers does not require us to learn the gray value of each pixel in the input image.
• Instead, we extract the essential information that will allow us to solve the problem.
• The latent representation (how to write each number) is very useful for various tasks, such as understanding the essential features of a dataset.

Unsupervised Representation Learning


• Unsupervised learning approach to learn a lower-dimensional feature representation from unlabelled training data (via an encoder)
• The latent feature space usually has a lower dimension than that of the input data (dimensionality reduction)

[Figure: Input Data → Encoder → Latent Feature]

Autoencoder to Reconstruct Input Data

• The decoder reconstructs data from the latent features.
• The latent features capture factors of variation in the training data.
• It is important to ensure that the features in the latent space are trained such that the original data can be reconstructed (via the decoder).

[Figure: Input Data → Encoder → Latent Feature → Decoder → (Reconstructed) Input Data]
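The encoder–decoder structure described above can be sketched in PyTorch. The sizes here (784-dimensional flattened inputs, a 32-dimensional latent feature) are illustrative assumptions, not values from the slides:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Minimal autoencoder: input -> latent feature -> reconstruction."""
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder compresses the input into a lower-dimensional latent feature.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder reconstructs the input from the latent feature.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)      # latent feature
        return self.decoder(z)   # reconstructed input

model = Autoencoder()
x = torch.rand(16, 784)          # a dummy minibatch of flattened images
x_hat = model(x)
print(x_hat.shape)               # torch.Size([16, 784])
```

Note that the latent feature (shape 16×32 here) is much smaller than the input, which forces the network to keep only the essential information.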

Loss Function

• The L2 loss function compares the difference between the input data and the reconstructed data.
• No labels are used: the network is trained so that the latent features can be used to reconstruct the original data.

[Figure: Input Data → Encoder → Latent Feature → Decoder → (Reconstructed) Input Data → L2 Loss Function]
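The L2 loss above is just the mean squared difference between the input and its reconstruction. A minimal sketch with made-up toy values:

```python
import numpy as np

def l2_loss(x, x_hat):
    """Mean squared error between the input and its reconstruction."""
    return np.mean((x - x_hat) ** 2)

x = np.array([1.0, 0.0, 0.5, 0.25])      # toy input "pixels"
x_hat = np.array([0.9, 0.1, 0.5, 0.25])  # toy reconstruction
print(l2_loss(x, x_hat))                  # ≈ 0.005
```

A perfect reconstruction gives a loss of exactly zero, so minimising this quantity drives the decoder output toward the input.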

Introducing Variational Autoencoders


• Autoencoders are not generative models and have issues with choosing the latent dimension.
• Variational autoencoders sample from the latent space to generate data.

[Figure: Input Data → Encoder → Latent Feature → Decoder → (Reconstructed) Input Data]

Diederik P. Kingma and Max Welling (2019), “An Introduction to Variational Autoencoders”, Foundations and Trends in Machine Learning

Main Idea behind the Variational Autoencoder

• Assume the distribution of the latent space representation is Gaussian, described by a mean and a covariance.
• The encoder attempts to estimate the parameters (mean, covariance) of this assumed distribution.
• A latent vector is “sampled” from the latent space distribution using the estimated parameters.
• The decoder generates an output based on this sampled latent vector.

[Figure: the Encoder estimates the parameters of the (assumed Gaussian) prior distribution of the latent space; samples are drawn from that distribution; the Decoder generates from the sample]
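Sampling from the estimated Gaussian is usually done with the reparameterisation trick, z = μ + σ·ε with ε ~ N(0, I), so gradients can flow through the sampling step. A minimal sketch; the two linear heads here are stand-ins for the encoder outputs, not a trained model:

```python
import torch
import torch.nn as nn

latent_dim = 2
# Stand-in "encoder" heads that estimate the distribution parameters.
fc_mu = nn.Linear(8, latent_dim)      # estimates the mean
fc_logvar = nn.Linear(8, latent_dim)  # estimates log-variance (diagonal covariance)

x = torch.rand(4, 8)                  # dummy encoder features for a minibatch of 4
mu, logvar = fc_mu(x), fc_logvar(x)

# Reparameterisation trick: z = mu + sigma * eps, with eps ~ N(0, I)
eps = torch.randn_like(mu)
z = mu + torch.exp(0.5 * logvar) * eps  # sampled latent vector fed to the decoder
print(z.shape)                          # torch.Size([4, 2])
```

Predicting the log-variance rather than the variance itself is a common choice: it keeps the variance positive without a constraint.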

Training a Variational Autoencoder


• For each minibatch of input data, compute the forward pass and apply backpropagation to update the weights.

[Figure: Image → Encoder (estimates mean and covariance) → Sampled Latent Space → Decoder (constructs image based on the sampled parameters) → Output Image; the forward pass runs top to bottom, backpropagation in reverse]
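One training step on a minibatch can be sketched as follows. The loss combines a reconstruction term with the closed-form KL divergence that pulls the estimated Gaussian toward a standard-normal prior; the MSE reconstruction term and all layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=128, latent_dim=16):
        super().__init__()
        self.enc = nn.Linear(input_dim, hidden_dim)
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # mean head
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance head
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # sample
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = F.mse_loss(x_hat, x, reduction="sum")  # reconstruction term
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, I):
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(32, 784)          # one dummy minibatch
x_hat, mu, logvar = model(x)     # forward pass
loss = vae_loss(x, x_hat, mu, logvar)
opt.zero_grad()
loss.backward()                  # backpropagation
opt.step()                       # weight update
```

Because of the reparameterised sampling inside `forward`, the gradient flows through the sampled latent vector back to the encoder's mean and log-variance heads.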

Generation with Variational Autoencoder

[Figure: grid of generated digits obtained by varying dimension 1 and dimension 2 of the sampled latent space]

Different dimensions of z (the sampled latent space) encode interpretable variations of the images (e.g., different digits, or degree of smile vs. head pose).

Kingma and Welling, “Auto-Encoding Variational Bayes”, ICLR 2014
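Once trained, generation needs only the decoder: sample (or sweep) z and decode. A sketch of the digit-grid idea above, using a stand-in, untrained decoder purely to show the mechanics (a real run would use the trained VAE decoder):

```python
import torch
import torch.nn as nn

latent_dim = 2
# Stand-in decoder; in practice, the trained VAE decoder is used here.
decoder = nn.Sequential(
    nn.Linear(latent_dim, 128), nn.ReLU(),
    nn.Linear(128, 784), nn.Sigmoid(),
)

# Sweep the two latent dimensions over a grid, as in the digit-grid figure.
grid = torch.linspace(-2.0, 2.0, steps=5)
zs = torch.cartesian_prod(grid, grid)  # all (dim1, dim2) pairs, shape (25, 2)
images = decoder(zs)                   # one generated image per latent grid point
print(images.shape)                    # torch.Size([25, 784])
```

Walking along one latent dimension while holding the other fixed produces the smooth, interpretable variation shown in the figure.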



Practical 2
Variational Autoencoders
