Black-box alpha (BB-α) is a new approximate inference method based on the minimization of α-divergences. BB-α scales to large datasets because it can be implemented using stochastic gradient descent. BB-α can be applied to complex probabilistic models with little effort since it only requires as input the likelihood function and its gradients. These gradients can be easily obtained using automatic differentiation. By changing the divergence parameter α, the method is able to interpolate between variational Bayes (VB) (α → 0) and an algorithm similar to expectation propagation (EP) (α = 1). Experiments on probit regression and neural network regression and classification problems show that BB-α with non-standard settings of α, such as α = 0.5, usually produces better predictions than with α → 0 (VB) or α = 1 (EP).
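For reference, a sketch of the divergence family involved (the abstract does not give the formula; this is the parameterization common in the α-divergence literature): Amari's α-divergence between the true posterior p and the approximation q is

$$ D_\alpha[p \,\|\, q] = \frac{1}{\alpha(1-\alpha)}\left(1 - \int p(\theta)^{\alpha}\, q(\theta)^{1-\alpha}\, d\theta\right), $$

which recovers KL(q‖p), the variational Bayes objective, in the limit α → 0, and KL(p‖q), the divergence locally minimized by EP, at α = 1.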
Through the sequential construction of posteriors as data are observed online, Bayes' theorem provides a natural framework for continual learning. We develop Variational Auto-Regressive Gaussian Processes (VAR-GPs), a principled posterior updating mechanism for solving sequential tasks in continual learning. Relying on sparse inducing-point approximations for scalable posteriors, we propose a novel auto-regressive variational distribution which reveals two fruitful connections to existing results in Bayesian inference: expectation propagation and orthogonal inducing points. Mean predictive entropy estimates show that VAR-GPs prevent catastrophic forgetting, which is empirically supported by strong performance on modern continual learning benchmarks against competitive baselines. A thorough ablation study demonstrates the efficacy of our modeling choices.
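As a schematic of the auto-regressive construction (assumed notation, not the paper's exact parameterization): with inducing variables $u_t$ introduced for each task $t$, the variational posterior can be chained as

$$ q(u_1, \dots, u_T) = q(u_1) \prod_{t=2}^{T} q(u_t \mid u_{1:t-1}), $$

so that each new task's posterior conditions on summaries of all earlier tasks rather than revisiting their data.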
Annealed Importance Sampling (AIS) [27, 18] is the gold standard for estimating partition functions or marginal likelihoods, corresponding to importance sampling over a path of distributions between a tractable base and an unnormalized target. While AIS yields an unbiased estimator for any path, the existing literature has been primarily limited to the geometric mixture or moment-averaged paths associated with the exponential family and KL divergence [13]. We explore AIS using q-paths, which include the geometric path as a special case and are related to the homogeneous power mean, deformed exponential family, and α-divergence [3].
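For reference, the geometric path that dominates existing practice interpolates between the tractable base $\pi_0$ and the unnormalized target $\tilde{\pi}_T$ (standard AIS notation, assumed here):

$$ \pi_\beta(x) \propto \pi_0(x)^{1-\beta}\, \tilde{\pi}_T(x)^{\beta}, \qquad 0 = \beta_0 < \beta_1 < \dots < \beta_K = 1, $$

with AIS accumulating importance weights across adjacent pairs of these intermediate densities.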
The proliferation of computing devices has brought about an opportunity to deploy machine learning models on new problem domains using previously inaccessible data. Traditional algorithms for training such models often require data to be stored on a single machine with compute performed by a single node, making them unsuitable for decentralised training on multiple devices. This deficiency has motivated the development of federated learning algorithms, which allow multiple data owners to train collaboratively and use a shared model whilst keeping local data private. However, many of these algorithms focus on obtaining point estimates of model parameters, rather than probabilistic estimates capable of capturing model uncertainty, which is essential in many applications. Variational inference (VI) has become the method of choice for fitting many modern probabilistic models. In this paper we introduce partitioned variational inference (PVI), a general framework for performing VI in the federated setting. We develop new supporting theory for PVI, demonstrating a number of properties that make it an attractive choice for practitioners; use PVI to unify a wealth of fragmented, yet related literature; and provide empirical results that showcase the effectiveness of PVI in a variety of federated settings.
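A rough sketch of the structure PVI works with (assumed notation): the approximate posterior is maintained as a product of the prior and one approximate likelihood factor per partition of the data, e.g. per client,

$$ q(\theta) \propto p(\theta) \prod_{k=1}^{K} t_k(\theta), $$

and each client locally refines its own factor $t_k$ against its private data shard before communicating the update.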
Many common machine learning methods involve the geometric annealing path, a sequence of intermediate densities between two distributions of interest constructed using the geometric average. While alternatives such as the moment-averaging path have demonstrated performance gains in some settings, their practical applicability remains limited by exponential family endpoint assumptions and the lack of a closed-form energy function. In this work, we introduce q-paths, a family of paths which is derived from a generalized notion of the mean, includes the geometric and arithmetic mixtures as special cases, and admits a simple closed form involving the deformed logarithm function from nonextensive thermodynamics. Following previous analysis of the geometric path, we interpret our q-paths as corresponding to a q-exponential family of distributions, and provide a variational representation of intermediate densities as minimizing a mixture of α-divergences to the endpoints. We show that small deviations away from the geometric path yield empirical gains for Bayesian inference using Sequential Monte Carlo and generative model evaluation using Annealed Importance Sampling.
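Concretely (using the notation common in this line of work, assumed here), the deformed logarithm and the corresponding q-path between a base $\pi_0$ and unnormalized target $\tilde{\pi}_T$ are

$$ \ln_q(x) = \frac{x^{1-q} - 1}{1-q}, \qquad \pi^{(q)}_\beta(x) \propto \Big[(1-\beta)\,\pi_0(x)^{1-q} + \beta\,\tilde{\pi}_T(x)^{1-q}\Big]^{\frac{1}{1-q}}, $$

recovering the geometric path in the limit q → 1 and the arithmetic mixture at q = 0.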
Proceedings of the 24th ACM Symposium on Access Control Models and Technologies
Relationship-based access control (ReBAC) is a flexible and expressive framework that allows policies to be expressed in terms of chains of relationships between entities as well as attributes of entities. ReBAC policy mining algorithms have the potential to significantly reduce the cost of migration from legacy access control systems to ReBAC, by partially automating the development of a ReBAC policy. Existing ReBAC policy mining algorithms support a policy language with a limited set of operators, which limits their applicability. This paper presents a ReBAC policy mining algorithm designed to be both (1) easily extensible (to support additional policy language features) and (2) scalable. The algorithm is based on Bui et al.'s evolutionary algorithm for ReBAC policy mining. First, we simplify their algorithm in order to make it easier to extend, and we provide a methodology for extending it to handle new policy language features. However, extending the policy language increases the search space of candidate policies explored by the evolutionary algorithm, causing longer running times and/or worse results. To address this problem, we enhance the algorithm with a feature selection phase that uses a neural network to identify useful features, and we use the result of feature selection to reduce the evolutionary algorithm's search space. The new algorithm is easy to extend and, as shown by our experiments, is more efficient and produces better policies.
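The feature-selection idea can be illustrated with a minimal sketch (all names hypothetical; the paper's actual network architecture and scoring scheme are not specified in this abstract): train a small model to predict permissions from candidate policy features, then keep only the features with the largest learned weights before running the evolutionary search.

```python
import numpy as np

# Toy access-control data: rows are (user, resource) pairs, columns are
# candidate policy features (e.g. attribute/relationship predicates),
# labels indicate whether access is permitted. All names are hypothetical.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 20)).astype(float)
w_true = np.zeros(20)
w_true[[2, 7, 11]] = 3.0  # only three features actually matter
y = (X @ w_true + 0.1 * rng.standard_normal(500) > 4.0).astype(float)

# Single-layer logistic model trained with gradient descent.
w, b, lr = np.zeros(20), 0.0, 0.1
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= lr * (X.T @ (p - y) / len(y))
    b -= lr * float(np.mean(p - y))

# Keep the top-k features by weight magnitude to shrink the search
# space explored by the evolutionary policy-mining phase.
k = 5
selected = np.argsort(-np.abs(w))[:k]
print("selected features:", sorted(selected.tolist()))
```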
Variational inference recently became the de facto standard method for approximate Bayesian neural networks. However, the standard mean-field approach (MFVI) possesses many undesirable behaviours. This short paper empirically investigates the variational biases [Turner and Sahani, 2011] of MFVI and other variational families. The preliminary results shed light on the poor performance of many variational approaches for model selection.
Attribute-Based Access Control (ABAC) and Relationship-based access control (ReBAC) provide a high level of expressiveness and flexibility that promote security and information sharing, by allowing policies to be expressed in terms of attributes of and chains of relationships between entities. Algorithms for learning ABAC and ReBAC policies from legacy access control information have the potential to significantly reduce the cost of migration to ABAC or ReBAC. This paper presents the first algorithms for mining ABAC and ReBAC policies from access control lists (ACLs) and incomplete information about entities, where the values of some attributes of some entities are unknown. We show that the core of this problem can be viewed as learning a concise three-valued logic formula from a set of labeled feature vectors containing unknowns, and we give the first algorithm (to the best of our knowledge) for that problem.
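The three-valued core can be illustrated with a small sketch (a hypothetical encoding, assuming Kleene's strong three-valued logic, which is one natural choice for propagating unknowns): literals evaluate to true, false, or unknown, and a formula over feature vectors with missing attribute values is evaluated accordingly.

```python
from itertools import product

# Kleene strong three-valued logic: values are True, False, or None (unknown).
def k_and(a, b):
    if a is False or b is False:
        return False
    if a is True and b is True:
        return True
    return None  # unknown

def k_or(a, b):
    if a is True or b is True:
        return True
    if a is False and b is False:
        return False
    return None

def k_not(a):
    return None if a is None else not a

# Evaluate (f0 AND NOT f1) OR f2 on a feature vector with unknowns.
def formula(v):
    return k_or(k_and(v[0], k_not(v[1])), v[2])

for v in product([True, False, None], repeat=3):
    print(v, "->", formula(v))
```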
A method for large scale Gaussian process classification has been recently proposed based on expectation propagation (EP). Such a method allows Gaussian process classifiers to be trained on very large datasets that were out of the reach of previous deployments of EP and has been shown to be competitive with related techniques based on stochastic variational inference. Nevertheless, the memory resources required scale linearly with the dataset size, unlike in variational methods. This is a severe limitation when the number of instances is very large. Here we show that this problem is avoided when stochastic EP is used to train the model.
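The memory saving can be sketched as follows (the standard stochastic EP construction, assumed here rather than quoted from the paper): EP stores one approximate site factor per data point, whereas stochastic EP ties them together,

$$ q_{\text{EP}}(f) \propto p(f) \prod_{n=1}^{N} \tilde{t}_n(f), \qquad q_{\text{SEP}}(f) \propto p(f)\, \tilde{t}(f)^{N}, $$

so the O(N) per-site memory of EP is replaced by a single shared factor whose storage cost is independent of the dataset size.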
Relationship-based access control (ReBAC) extends attribute-based access control (ABAC) to allow policies to be expressed in terms of chains of relationships between entities. ReBAC policy mining algorithms have the potential to significantly reduce the cost of migration from legacy access control systems to ReBAC, by partially automating the development of a ReBAC policy. This paper presents algorithms for mining ReBAC policies from information about entitlements together with information about entities. It presents the first such algorithms designed to handle incomplete information about entitlements, typically obtained from operation logs, and noise (errors) in information about entitlements. We present two algorithms: a greedy search guided by heuristics, and an evolutionary algorithm. We demonstrate the effectiveness of the algorithms on several policies, including three large case studies.
Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence
A new genetic algorithm (GA) for the Traveling Salesman Problem (TSP) is given. Two novel features of this algorithm are: (i) a new locus-based encoding/crossover pair, and (ii) a static preprocessing step which changes the order of the vertices. It is believed that this algorithm is also applicable to other ordering problems, not just the TSP. Experimental results on the standard benchmarks for the TSP are favorable.
Label propagation is a powerful and flexible semi-supervised learning technique on graphs. Neural network architectures, on the other hand, have proven track records in many supervised learning tasks. In this work, we propose a training objective for neural networks, Neural Graph Machines, for combining the power of neural networks and label propagation. The new objective allows the neural networks to harness both labeled and unlabeled data by: (a) allowing the network to train using labeled data as in the supervised setting, (b) biasing the network to learn similar hidden representations for neighboring nodes on a graph, in the same vein as label propagation. Such architectures with the proposed objective can be trained efficiently using stochastic gradient descent and scaled to large graphs. The proposed method is experimentally validated on a wide range of tasks (multi-label classification on social graphs, news categorization and semantic intent classification) using different ...
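Schematically (a simplified form; the paper's full objective distinguishes between edge types, which is omitted here), the training objective combines a supervised loss over labeled nodes with a graph-smoothness penalty on hidden representations:

$$ \mathcal{L} = \sum_{n \in \mathcal{D}_L} \ell\big(y_n, f(x_n)\big) + \lambda \sum_{(u,v) \in \mathcal{E}} w_{uv}\, d\big(h_u, h_v\big), $$

where $h_u$ is the hidden representation the network computes for node $u$, $w_{uv}$ is the edge weight, and $d$ is a distance such as the squared $\ell_2$ norm.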
Probabilistic neural networks are typically modeled with independent weight priors, which do not capture weight correlations in the prior and do not provide a parsimonious interface to express properties in function space. A desirable class of priors would represent weights compactly, capture correlations between weights, facilitate calibrated reasoning about uncertainty, and allow inclusion of prior knowledge about the function space such as periodicity or dependence on contexts such as inputs. To this end, this paper introduces two innovations: (i) a Gaussian process-based hierarchical model for network weights based on unit embeddings that can flexibly encode correlated weight structures, and (ii) input-dependent versions of these weight priors that can provide convenient ways to regularize the function space through the use of kernels defined on contextual inputs. We show these models provide desirable test-time uncertainty estimates on out-of-distribution data, demonstrate case...
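A schematic of the first construction (assumed notation): each unit $i$ in the network is given a learnable embedding $z_i$, and the weight connecting units $i$ and $j$ is drawn from a shared Gaussian process evaluated at the pair of embeddings,

$$ w_{ij} = g\big([z_i, z_j]\big), \qquad g \sim \mathcal{GP}\big(0, k(\cdot, \cdot)\big), $$

so correlations between weights are induced whenever their unit embeddings are close under the kernel; the input-dependent variant additionally conditions $k$ on contextual inputs.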
We introduce the Gaussian Process Convolution Model (GPCM), a two-stage non-parametric generative procedure to model stationary signals as the convolution between a continuous-time white-noise process and a continuous-time linear filter drawn from a Gaussian process. The GPCM is a continuous-time nonparametric-window moving average process and, conditionally, is itself a Gaussian process with a nonparametric kernel defined in a probabilistic fashion. The generative model can be equivalently considered in the frequency domain, where the power spectral density of the signal is specified using a Gaussian process. One of the main contributions of the paper is to develop a novel variational free-energy approach based on inter-domain inducing variables that efficiently learns the continuous-time linear filter and infers the driving white-noise process. In turn, this scheme provides closed-form probabilistic estimates of the covariance kernel and the noise-free signal both in denoising and p...
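The two-stage generative procedure can be written compactly (notation assumed from the description above):

$$ f(t) = \int h(t - \tau)\, w(\tau)\, d\tau, \qquad h \sim \mathcal{GP}(0, k_h), \quad w \text{ a white-noise process}, $$

so that, conditioned on the filter $h$, $f$ is a Gaussian process whose kernel is determined by the convolution of $h$ with itself.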
Applications of Wireless Sensor Networks (WSN) in harsh conditions usually cover a vast area, with sensors randomly deployed by an uncontrolled method, e.g. dropped from helicopters. Thus the actual topology is unpredictable and can suffer from congestion. We propose the FCD framework for congestion detection, based on clustering techniques combined with Petri net modelling and verification.
This paper develops variational continual learning (VCL), a simple but general framework for continual learning that fuses online variational inference (VI) and recent advances in Monte Carlo VI for neural networks. The framework can successfully train both deep discriminative models and deep generative models in complex continual learning settings where existing tasks evolve over time and entirely new tasks emerge. Experimental results show that VCL outperforms state-of-the-art continual learning methods on a variety of tasks, avoiding catastrophic forgetting in a fully automatic way.
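The core recursion underlying VCL is the projected Bayesian update of online VI (standard form, assumed here): after observing task $t$'s data $\mathcal{D}_t$, the new approximate posterior is

$$ q_t(\theta) = \arg\min_{q \in \mathcal{Q}} \; \mathrm{KL}\Big( q(\theta) \,\Big\|\, \tfrac{1}{Z_t}\, q_{t-1}(\theta)\, p(\mathcal{D}_t \mid \theta) \Big), $$

with $q_0$ set to the prior, so each previous approximation plays the role of the prior for the next task.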
We introduced the Gaussian Process Convolution Model (GPCM) in [1], a time-series model for stationary signals based on the convolution between a continuous-time white-noise process and a continuous-time linear filter drawn from a Gaussian process. The GPCM is, conditionally, itself a Gaussian process with a nonparametric kernel defined in a probabilistic fashion. Learning is achieved using a variational free-energy approach based on inter-domain inducing variables that summarise the (posterior) continuous-time linear filter and the driving white-noise process. However, the inter-domain transformation in [1] considers local averages of the noise process and therefore requires a large number of inducing variables to represent underlying functions with complex spectra. In this paper, we develop an alternative transformation, operating directly in the frequency domain, that retains the same modelling and predictive power as the original but requires fewer inducing variables and, consequently, ha...
We introduce a new approach for performing accurate and computationally efficient posterior inference for Gaussian Process regression problems that exploits the combination of pseudo-point approximations and approximately circulant covariance structure. We argue mathematically that the new technique has substantially lower asymptotic complexity than traditional pseudo-point approximations and demonstrate empirically that it returns results that are very close to those obtained using exact inference.
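The structural ingredient being exploited can be sketched in isolation (a generic illustration, not the paper's full algorithm): a circulant covariance matrix is diagonalized by the discrete Fourier transform, so matrix-vector products cost O(N log N) instead of O(N^2).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 512
# First column of a symmetric circulant kernel matrix (wrapped squared-exponential).
c = np.exp(-np.minimum(np.arange(n), n - np.arange(n)) ** 2 / 50.0)
v = rng.standard_normal(n)

# O(N log N) circulant matvec: C v = ifft(fft(c) * fft(v)).
fast = np.fft.ifft(np.fft.fft(c) * np.fft.fft(v)).real

# O(N^2) reference: build the dense circulant matrix explicitly.
C = np.array([np.roll(c, i) for i in range(n)]).T  # C[:, j] = roll(c, j)
assert np.allclose(fast, C @ v)
print("max error:", np.max(np.abs(fast - C @ v)))
```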