
All Questions

2 votes
1 answer
45 views

Posterior estimation using VAE

Using normalizing flows, we can model a model's posterior $p(\theta|D)$ by feeding Gaussian noise $z$ to the NF (parametrized by $\phi$), using the output $\theta$ of the NF as the model parameters, and ...
Alberto
  • 1,381
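A note on the question above: in the standard flow-based VI setup (assumed here, since the excerpt is truncated), with $\theta = f_\phi(z)$ and $z \sim \mathcal{N}(0, I)$, the objective being maximized is typically $$\mathrm{ELBO}(\phi) = \mathbb{E}_{z \sim \mathcal{N}(0,I)}\big[\log p(D \mid f_\phi(z)) + \log p(f_\phi(z)) - \log q_\phi(f_\phi(z))\big], \qquad \log q_\phi(\theta) = \log \mathcal{N}(z; 0, I) - \log\left|\det \frac{\partial f_\phi}{\partial z}\right|.$$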
2 votes
2 answers
47 views

VAEs - Two questions regarding the posterior and prior distribution derivations

I'm really struggling to understand the first step in the ELBO derivation in VAEs. When asking my questions I'll also try to clearly state my assumptions, since perhaps some of them are wrong to begin with: ...
DrPrItay
  • 121
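For reference, the first step being asked about is usually the introduction of $q_\phi(z|x)$ followed by Jensen's inequality: $$\log p_\theta(x) = \log \int p_\theta(x, z)\,dz = \log \mathbb{E}_{q_\phi(z|x)}\!\left[\frac{p_\theta(x, z)}{q_\phi(z|x)}\right] \ge \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x, z) - \log q_\phi(z|x)\big] = \mathrm{ELBO}.$$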
1 vote
0 answers
59 views

How to speed up the following ELBO evaluation?

I have an estimation problem where I need to maximize the evidence lower bound: $$ \mathrm{ELBO} = -\frac{1}{2} \Bigg( \mathbb{E}_{q(\theta)} \left[ \mathrm{vec}(\mathbf{Z})^{\mathrm{H}} \mathbf{C}^{-...
CfourPiO
  • 315
0 votes
0 answers
34 views

Is the inferential challenge of dense Bayesian or Markov networks solved by recent improvements in variational inference and neural networks?

I am trying to understand more about Graphical models, and have a reasonable grasp of the basics now. One issue that recurs in a lot of the papers of the mid-2000s and even in Koller's textbook is ...
krishnab
  • 1,582
1 vote
0 answers
28 views

Why do we need to marginalize when finding p(data) when latent variables are involved? (part of the ELBO derivation)

I'm so confused by the derivation of the ELBO. In part of the derivation, p(data) is intractable as it involves an integral over a high-dimensional latent variable. I can't understand why the latent ...
user425635
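For context, the marginalization in question is $$p(x) = \int p(x, z)\,dz = \int p(x \mid z)\,p(z)\,dz,$$ which is needed because the generative model only specifies $p(x \mid z)$ and $p(z)$, not $p(x)$ directly.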
3 votes
1 answer
54 views

When deriving the ELBO to solve variational inference problems, why do we know p(z) and p(x,z) but not p(x) and p(z|x)?

I am a bit lost with the derivation of the ELBO because I don't understand why some distributions are known and some are unknown. I guess we know p(z) (the prior) because it was the last value of q(z) ...
user425635
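A short note on the usual answer: the model specifies the joint as $p(x, z) = p(x \mid z)\,p(z)$, so $p(z)$ and $p(x, z)$ are available by construction, whereas $$p(x) = \int p(x, z)\,dz \quad\text{and}\quad p(z \mid x) = \frac{p(x, z)}{p(x)}$$ both require the (generally intractable) integral over $z$.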
1 vote
0 answers
40 views

ELBO & "backwards" KL divergence argument order

On Wikipedia it says: "A simple interpretation of the KL divergence of P from Q [i.e. D_KL(P||Q)] is the expected excess surprise from using Q as a model instead of P when the actual distribution ...
profPlum
  • 451
0 votes
0 answers
11 views

Setting inducing points to non-trainable in Gaussian process regression

I notice that in the GPflow tutorials for Stochastic Variational Inference they choose a certain number of inducing points, and after that they make them not trainable. Here they set it to not ...
Francisco Javier Jara Ávila
0 votes
1 answer
291 views

Exploring VAE latent space

I recently trained an AE and a VAE and used the latent variables of each for a clustering task. It seemed to work well, with sensible clusters. The main reason for training the VAE was to gain more ...
Nathan Thompo
2 votes
1 answer
116 views

Why is sampling from the posterior a good estimate for the likelihood, but sampling from the prior bad?

In Variational Autoencoders (VAE), we have: $$ \log p_\theta(x) = \log \left[ \int p_\theta(x \mid z)p(z) \, dz \right] $$ where $ p_\theta(x \mid z) = \mathcal{N}(x; \mu_\theta(z), I) $ and $ p(z) = \...
rando
  • 328
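For reference, both estimators discussed in this question come from the same importance-sampling identity, $$p_\theta(x) = \mathbb{E}_{p(z)}\big[p_\theta(x \mid z)\big] = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\frac{p_\theta(x \mid z)\,p(z)}{q_\phi(z \mid x)}\right],$$ and the proposal $q_\phi(z \mid x)$ typically gives a far lower-variance Monte Carlo estimate than the prior, which rarely places mass where $p_\theta(x \mid z)$ is large.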
2 votes
1 answer
75 views

Why is the forward process referred to as the "ground truth" in diffusion models?

I've seen many tutorials on diffusion models refer to the distribution of the latent variables induced by the forward process as "ground truth". I wonder why. What we can actually see is ...
Daniel Mendoza
2 votes
2 answers
100 views

Why does Variational Inference work?

The ELBO is a lower bound, and only matches the true likelihood when the q-distribution/encoder we choose equals the true posterior distribution. Are there any guarantees that maximizing the ELBO indeed ...
Daniel Mendoza
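The usual starting point for this question is the exact decomposition $$\log p_\theta(x) = \mathrm{ELBO}(\theta, \phi; x) + D_{KL}\big(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\big),$$ so for fixed $\theta$, maximizing the ELBO over $\phi$ is equivalent to minimizing the KL gap to the true posterior.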
8 votes
2 answers
225 views

Theoretical justification for minimizing $KL(q_\phi|p)$ rather than $KL(p|q_\phi)$?

Suppose we have a true but unknown distribution $p$ over some discrete set (i.e. assume no structure or domain knowledge), and a parameterized family of distributions $q_\phi$. In general it makes ...
user56834
  • 2,987
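One practical part of the answer, stated as an equation: with $p(z \mid x)$ as the target, $$D_{KL}(q_\phi \,\|\, p) = \mathbb{E}_{q_\phi}[\log q_\phi(z)] - \mathbb{E}_{q_\phi}[\log p(x, z)] + \log p(x),$$ so minimizing it only requires expectations under $q_\phi$ and the unnormalized density $p(x, z)$, whereas $D_{KL}(p \,\|\, q_\phi)$ needs expectations under the unknown $p$ itself.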
1 vote
0 answers
21 views

Getting accurate Uncertainty from MFVI?

I wanted to know if there has been any research on methods to improve the accuracy of Mean-Field Variational Inference (without discarding the mean-field approximation). Apparently it is known to ...
profPlum
  • 451
0 votes
0 answers
20 views

Sampling Gauss-Bernoulli RBM

In the 2018 paper Stein Variational Gradient Descent Without Gradient, the authors analyze the sampling performance of their algorithm on multiple benchmarks. One of them is sampling from a Gauss-...
HansDoe
  • 11
0 votes
0 answers
19 views

ShapeNet VAE KL Divergence issues

I am trying to train a VAE on ShapeNet but I can't seem to make it work. Any help or ideas would be highly appreciated. The problem is that whenever I apply the KL divergence loss, the network seems to ...
Youssef
2 votes
0 answers
44 views

Posterior approximation following optimization methods

I'm trying to quantify the uncertainty in a high-dimensional, multimodal posterior space. We do not have an analytical solution for the forward model, and the forward model could be expensive to ...
Geooo
  • 21
0 votes
0 answers
30 views

Variational inference question - how to get eq. 22 from eq. 21

Referring to David Blei's notes on variational inference, I wonder how to get eq. 22 from eq. 21. Also, what is $z_{-k}$ in $L_k = \int q(z_k) \mathbb{E}_{-k}[\log p(z_k | z_{-k},x)]dz_k - \int q(z_k)\...
Mark
  • 171
0 votes
0 answers
23 views

Conditions of application for coordinate ascent variational inference?

In every reference about coordinate ascent variational inference for the mean-field family (Chapter 10 of C. Bishop's book Pattern Recognition and Machine Learning, or the review article by Blei ...
Pierre Gloaguen
1 vote
0 answers
41 views

Using bootstrap for accurate posterior in Variational Bayes

A well-known issue in Variational Bayes is the underestimation of the posterior variance. Some methods using "sandwich" variances have already been proposed but provide frequentist ...
Mangnier Loïc
0 votes
0 answers
142 views

VAE with a linear decoder and nonlinear encoder: does this just learn a linear decomposition of the data?

There are a number of variational autoencoder (VAE) methods that have nonlinear encoders and linear decoders. The idea of using a linear decoder is to improve interpretability (which features ...
sanK
  • 1
1 vote
0 answers
50 views

Understanding Variational inference and EM in relation to each other

I have read several answers (like here), but somehow I still have a few doubts. I hope to present my understanding and ask a few questions to clear my doubts. EM: a maximization-maximization algorithm. E-...
figs_and_nuts
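For context, both procedures can be read as coordinate ascent on the same functional $$\mathcal{F}(q, \theta) = \mathbb{E}_{q(z)}[\log p_\theta(x, z)] - \mathbb{E}_{q(z)}[\log q(z)],$$ where the EM E-step sets $q(z) = p_\theta(z \mid x)$ exactly (making the bound tight), while VI restricts $q$ to a tractable family and only maximizes within it.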
4 votes
3 answers
185 views

Justification of the independence assumption for latent variables in the Expectation-Maximization algorithm

When deriving the ELBO/free energy in the EM algorithm, it is often done in a "general" case of observed and latent variables, and then an assumption of independent (or i.i.d.) variables is ...
user246795
2 votes
1 answer
85 views

How to calculate the score of a new datapoint with a score-based diffusion model (Song & Ermon, 2019)?

I have a pretrained score-based diffusion model trained on 64x64 images. Now I want to calculate the score of a new image (of the same dimensions) through this pre-trained neural network. The score network ...
rajoy99
  • 21
1 vote
0 answers
23 views

Using MCMC-derived posterior to design variational approximation function

I am trying to fit a hierarchical model that estimates the covariance of some parameters, using the probabilistic programming language pyro. In simulation experiments, I saw that the MCMC generates ...
David Shor
1 vote
1 answer
362 views

Understanding a beta-variational autoencoder

I'm working on a beta-variational autoencoder using car images from the Vehicle Color Recognition Dataset. At this point, I'm just exploring different architectures and values for beta. (If you're ...
KirkD_CO
  • 1,170
0 votes
0 answers
44 views

Inference of Beta-Bernoulli Distribution

Assume $x_1, x_2, \cdots, x_n$ follow $Bern(\pi_0)$. Let $y_{ik}$ follow $Beta(\alpha,\beta)$, $i\in \{1,\cdots, n\}$, and $k\in \{1,\cdots, K\}$. Let $z_k$ follow a Bernoulli distribution with a ...
LAM_MN
  • 1
1 vote
0 answers
94 views

Why can Variational Autoencoders (VAEs) approximate arbitrary distributions?

I am trying to reason to myself about why it is that VAEs can approximate arbitrary probability distributions even though $q_{\phi}(z|x)$ and $p_{\theta}(x|z)$ are Gaussian. I understand that the parameters ...
Decaying Tails
1 vote
0 answers
35 views

Tree-reweighted belief propagation: optimizing edge appearances $\mu$

I am currently implementing Tree-Reweighted Belief Propagation (TRBP) to optimize edge appearances. In the main manuscript of this work, the authors keep the edge appearances, represented by $\mu$, fixed [...
c.uent
  • 115
5 votes
1 answer
93 views

Calculation of an optimal variational distribution for covariance parameters in a Bayesian graphical lasso model

Context: I am considering here a variational Bayesian framework where I need to calculate the optimal variational distribution for some covariance parameters. Formally the model can be expressed as: $$...
Mangnier Loïc
2 votes
1 answer
52 views

Do I need to take additional log det Jacobians for every PDF that uses the reparameterization trick?

Consider the -ELBO objective with reparameterization, which is also used in VAEs: $$ \mathcal L_{\theta,\phi}(x)=\log p_\theta(X|Z)+\log p_\theta(Z) +\log q_\phi(Z) $$ The reparameterization trick ...
wd violet
  • 777
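A minimal PyTorch sketch of the point at issue (variable names are illustrative, not from the question): for a Gaussian $q_\phi(z|x)$, the reparameterized sample $z = \mu + \sigma \odot \epsilon$ is scored directly under $\mathcal{N}(\mu, \sigma^2)$, so no extra log-det term is added; the change of variables from $\epsilon$ to $z$ is already contained in that density.

    import torch
    from torch.distributions import Normal

    # Illustrative encoder outputs for a single datapoint.
    mu = torch.zeros(4, requires_grad=True)
    log_sigma = torch.zeros(4, requires_grad=True)
    sigma = log_sigma.exp()

    # Reparameterized sample: z = mu + sigma * eps, with eps ~ N(0, I).
    eps = torch.randn(4)
    z = mu + sigma * eps

    # log q_phi(z|x): evaluate the Gaussian density at z directly.
    # The Jacobian of the map eps -> z is already accounted for in this
    # density, so no separate log|det J| term is needed for a plain Gaussian q.
    log_qz = Normal(mu, sigma).log_prob(z).sum()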
1 vote
0 answers
64 views

How does a Variational Autoencoder approximate the joint probability distribution?

I know that in Variational Inference the idea is to approximate the posterior P(z|x, y) and I know that Variational AutoEncoders (VAEs) use the idea of variational inference through neural network ...
Amir Jalilifard
5 votes
1 answer
219 views

Derivation of coordinate ascent variational inference

The slides on variational inference show the evidence lower bound ($L$) and the derivative with respect to a variational distribution $q(z_k)$, quoted as follows: $$ L_k = \int q(z_k) E_{-k} \bigg[ \log ...
avocado
  • 3,653
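For reference, setting the derivative of $L_k$ to zero (subject to $\int q(z_k)\,dz_k = 1$) yields the standard CAVI update $$q^*(z_k) \propto \exp\big\{E_{-k}\big[\log p(z_k \mid z_{-k}, x)\big]\big\},$$ where $E_{-k}$ denotes the expectation over $\prod_{j \neq k} q(z_j)$.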
0 votes
0 answers
22 views

Understanding a line in the derivation of the KL divergence optimising function in Variational Bayes

I am following the derivation of the Variational Bayes approach in David Blei's lecture notes, particularly equations (13-16). In particular, the line: $$ = E_q [\ \log_2 q(Z) ]\ - E_q \left[\ \log_2 \...
Joseph
  • 143
3 votes
2 answers
460 views

Replacing the KL-divergence term in a VAE with parameter regularization

When training a VAE, one aims to optimize the function $\mathcal{L}$, defined as: $$\mathcal{L}\left(\theta,\phi; \mathbf{x}^{(i)}\right) = - D_{KL}\left(q_\phi(\mathbf{z}|\mathbf{x}^{(i)}) || p_\theta(\...
Asterion
  • 946
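For a diagonal-Gaussian encoder and a standard-normal prior, the KL term in this objective has the familiar closed form $$D_{KL}\big(q_\phi(\mathbf{z}|\mathbf{x}^{(i)}) \,\|\, \mathcal{N}(\mathbf{0}, \mathbf{I})\big) = \frac{1}{2}\sum_j\big(\mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1\big),$$ which is the term any proposed parameter-regularization surrogate would be compared against.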
1 vote
0 answers
126 views

How to measure posterior collapse, if any

Is there any theoretical work on how to measure posterior collapse? One can measure the decoder output, but it is not clear whether the degradation (if any) happened due to posterior collapse or due to failing ...
Pavel Podlipensky
1 vote
0 answers
113 views

In the β-TCVAE paper, can someone help with the derivation (S3) in Appendix C.1?

Paper: Isolating Sources of Disentanglement in VAEs. I follow as far as $$\mathbb{E}_{q(z)}[\log q(z)] = \mathbb{E}_{q(z, n)}[\ \log\ \mathbb{E}_{n'\sim\ p(n)}[q(z|n')]\ ]$$ Subsequently, I don't ...
S R
  • 33
1 vote
0 answers
172 views

Are there any methods that combine MCMC and VI?

Are there any methods that combine VI and MCMC? If such methods exist, why aren't they used prominently over techniques such as NUTS or other VI approaches?
JJbox
  • 11
2 votes
0 answers
305 views

Why is the Wasserstein distance not used in Variational Inference?

I just started learning the concept of variational inference in the context of variational autoencoders, so please excuse me if the answer is obvious. I would like to know why, traditionally, the KL-...
user3748950
4 votes
1 answer
887 views

Justification of the fixed variational distribution in diffusion models

Diffusion models can be regarded as latent variable models (Ho et al., 2020; Section 2), with the latents being a hierarchical chain of random variables $z_T → \dots → z_t → z_{t-1} → \dots → z_1$ (...
Dan Oneață
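For context, the fixed variational distribution in Ho et al. (2020) is the Gaussian forward (noising) process, written in the question's notation as $$q(z_t \mid z_{t-1}) = \mathcal{N}\big(z_t;\, \sqrt{1 - \beta_t}\, z_{t-1},\, \beta_t \mathbf{I}\big),$$ with a prescribed variance schedule $\beta_1, \dots, \beta_T$ rather than learned parameters.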
7 votes
2 answers
1k views

VAE: How is the likelihood $p(x|z)$ defined?

Disclaimer: I don't have a strong background in Bayesian statistics. I gather from questions such as this one and this one that, in the context of VAEs, we suppose that we know the (form of the?) prior $p(z)$ ...
Soltius
  • 1,396
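A common answer, under the usual modelling assumptions: $p(x|z)$ is chosen by the modeller, with the decoder network outputting its parameters, e.g. $$p_\theta(x \mid z) = \mathcal{N}\big(x;\, \mu_\theta(z),\, \sigma^2 \mathbf{I}\big) \quad\text{or}\quad p_\theta(x \mid z) = \prod_j \mathrm{Bern}\big(x_j;\, \pi_{\theta, j}(z)\big)$$ for real-valued and binary data respectively.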
3 votes
2 answers
420 views

Variational inference: is the evidence constant?

I'm studying variational inference (in the context of VAEs), and I'm watching this video at this time point. There, the goal of approximating the intractable posterior $p_{\theta}(...
Soltius
  • 1,396
0 votes
0 answers
74 views

Understanding Variational inference for LDA

I am trying to derive variational inference for LDA from scratch. I am following this course: https://home.cs.colorado.edu/~jbg/teaching/CSCI_5622/19a.pdf When computing $p(Z|\Theta)$, they do the ...
sam
  • 449
0 votes
0 answers
520 views

What is the closed form of the KL divergence between two relaxed Bernoulli distributions?

I've seen multiple papers that use a relaxation of the Bernoulli distribution as defined in Maddison et al. (there it is referred to as Binary Concrete), and they say that a closed-form solution for ...
dannybrig
1 vote
1 answer
213 views

Comparing the Gibbs sampler and variational inference

I am learning about variational inference and the Gibbs sampler. I am in the process of deriving variational inference on my own. In this process, I need to make a comparison with the Gibbs sampler. I am ...
sam
  • 449
7 votes
1 answer
4k views

What's the role of the commitment loss in VQ-VAE?

I'm reading about VQ-VAE, and trying to understand the commitment loss $\beta||z_e(x) - sg(e)||^2$, described in the following sentence: Finally, since the volume of the embedding space is ...
ihadanny
  • 3,360
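A minimal PyTorch sketch of the three VQ-VAE loss terms (function and variable names here are illustrative); the stop-gradient $sg$ in the question corresponds to .detach():

    import torch
    import torch.nn.functional as F

    def vq_vae_loss(z_e, e_q, x_recon, x, beta=0.25):
        """z_e: encoder output, e_q: nearest codebook vector; sg == detach."""
        recon = F.mse_loss(x_recon, x)                      # reconstruction / likelihood term
        codebook = F.mse_loss(e_q, z_e.detach())            # ||sg(z_e) - e||^2, pulls codes toward encoder outputs
        commitment = beta * F.mse_loss(z_e, e_q.detach())   # beta * ||z_e - sg(e)||^2, keeps the encoder committed to its code
        return recon + codebook + commitment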
4 votes
3 answers
128 views

What's the difference between p(Z, X=x) and p(Z|X=x)?

I'm trying to understand variational inference, and I've found resources that mention $p(Z, X=x)$, where $Z$ is a latent random variable and $X$ is the observed random variable. (Here is one such ...
Addison
  • 221
3 votes
1 answer
1k views

VQ-VAE objective - is it ELBO maximization, or minimization of the KL-divergence between the posterior and its approximation?

I'm reading two descriptions of the VQ-VAE objective: Kingma claims on page 18 that we want to maximize the ELBO, and shows that it can be written as $ELBO = \log p_{\theta}(x) - KL(q_{\phi}(z|x)||p_{\...
ihadanny
  • 3,360