Papers by Andrea Montanari
Abstract: We consider linear regression in the high-dimensional regime in which the number of observations $n$ is smaller than the number of parameters $p$. A very successful approach in this setting uses $\ell_1$-penalized least squares (aka the Lasso) to search for a subset of $s_0 < n$ parameters that best explain the data, while setting the other parameters to zero. A considerable amount of work has been devoted to characterizing the estimation and model selection problems within this approach.
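As a concrete illustration of the setting in this abstract, the sketch below solves the $\ell_1$-penalized least squares problem with iterative soft-thresholding (ISTA), a standard generic solver rather than anything specific to this paper; the dimensions, noise level, and penalty $\lambda$ are illustrative choices.

```python
import numpy as np

def soft_threshold(v, t):
    # elementwise soft-thresholding, the proximal operator of the l1 norm
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/2)||y - X b||^2 + lam ||b||_1 by iterative soft-thresholding."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2                      # Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        b = soft_threshold(b + X.T @ (y - X @ b) / L, lam / L)
    return b

rng = np.random.default_rng(0)
n, p, s0 = 50, 200, 5                                  # n < p, and s0 < n as in the abstract
beta = np.zeros(p)
beta[:s0] = 1.0                                        # sparse true parameter vector
X = rng.standard_normal((n, p)) / np.sqrt(n)           # columns of roughly unit norm
y = X @ beta + 0.01 * rng.standard_normal(n)
beta_hat = lasso_ista(X, y, lam=0.02)
support = np.flatnonzero(np.abs(beta_hat) > 0.05)      # estimated set of active parameters
```

With these (benign) illustrative parameters the estimator places its large coefficients on the true support and sets most of the remaining $p - s_0$ coordinates to zero.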
Abstract: Fine-grained network measurement requires routers and switches to update large arrays of counters at very high link speed (e.g., 40 Gbps). A naive algorithm needs an infeasible amount of SRAM to store both the counters and a flow-to-counter association rule, so that arriving packets can update the corresponding counters at link speed. This has made accurate per-flow measurement complex and expensive, and has motivated approximate methods that detect and measure only the large flows.
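The approximate methods mentioned above can be illustrated with a Count-Min sketch, a standard compact counter structure; this is a well-known generic example of the approximate-counting approach, not the scheme proposed in this paper, and the flow keys and table sizes are illustrative.

```python
import hashlib

class CountMinSketch:
    """Minimal Count-Min sketch: approximate per-flow counters in sublinear memory.
    Each flow hashes to one cell per row; queries take the minimum over rows,
    so estimates never under-count the true flow size."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # derive an independent-looking hash per row from a keyed digest
        h = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(h, "big") % self.width

    def update(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def query(self, key):
        # min over rows: over-estimates only if the key collides in every row
        return min(self.table[row][self._index(row, key)]
                   for row in range(self.depth))

cms = CountMinSketch(width=1024, depth=4)
for _ in range(5):
    cms.update("192.0.2.1->198.51.100.7")   # hypothetical flow identifiers
for _ in range(3):
    cms.update("192.0.2.2->198.51.100.9")
```

The memory cost is `width * depth` counters regardless of the number of flows, which is the trade-off that makes such sketches attractive at line rate.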
We provide the supplementary information for the article "On the Spread of Innovations in Social Networks", submitted to Proceedings of the National Academy of Sciences. This document mainly contains the proofs of lemmas and theorems stated in the paper. At the end, we also give a short comparison between the results of this paper and a few well-known results in economics.
arXiv:cond-mat/0512296v1 [cond-mat.stat-mech], 14 Dec 2005. Two Lectures on Iterative Coding and Statistical Mechanics. Andrea Montanari, Laboratoire de Physique Théorique de l'Ecole Normale Supérieure, 24, rue Lhomond, 75231 Paris CEDEX 05, France (dated February 2, 2008). These are the notes for two lectures delivered at the Les Houches summer school "Mathematical Statistical Mechanics," held in July 2005.
Abstract: This paper reports on our analysis of the 2011 CAMRa Challenge dataset (Track 2) for context-aware movie recommendation systems. The train dataset comprises 4,536,891 ratings provided by 171,670 users on 23,974 movies, as well as the household groupings of a subset of the users. The test dataset comprises 5,450 ratings for which the user label is missing, but the household label is provided. The challenge was to identify the user labels for the ratings in the test set.
Abstract: We provide an explicit formula for the limiting free energy density (log-partition function divided by the number of vertices) for ferromagnetic Potts models on uniformly sparse graph sequences converging locally to the d-regular tree for d even, covering all temperature regimes. This formula coincides with the Bethe free energy functional evaluated at a suitable fixed point of the belief propagation recursion on the d-regular tree, the so-called replica symmetric solution.
Abstract: Message passing algorithms have proved surprisingly successful in solving hard constraint satisfaction problems on sparse random graphs. In such applications, variables are fixed sequentially to satisfy the constraints. Message passing is run after each step. Its outcome provides a heuristic for making choices at the next step. This approach has been referred to as "decimation," with reference to analogous procedures in statistical physics.
Abstract: We consider communication over a noisy network under randomized linear network coding. Possible error mechanisms include node or link failures, Byzantine behavior of nodes, or an over-estimate of the network min-cut. Building on the work of Koetter and Kschischang, we introduce a probabilistic model for errors. We compute the capacity of this channel and we define an error-correction scheme based on random sparse graphs and a low-complexity decoding algorithm.
Abstract: Many problems of interest in computer science and information theory can be phrased in terms of a probability distribution over discrete variables associated with the vertices of a large (but finite) sparse graph. In recent years, considerable progress has been achieved by viewing these distributions as Gibbs measures and applying to their study heuristic tools from statistical physics. We review this approach and provide some results towards a rigorous treatment of these problems.
Abstract: We consider the problem of learning the structure of Ising models (pairwise binary Markov random fields) from i.i.d. samples. While several methods have been proposed to accomplish this task, their relative merits and limitations remain somewhat obscure. By analyzing a number of concrete examples, we show that low-complexity algorithms systematically fail when the Markov random field develops long-range correlations.
Abstract: Scaling laws are a powerful way to analyze the performance of moderately sized, iteratively decoded sparse graph codes. Our aim is to provide an easily usable finite-length optimization tool that is applicable to the wide variety of channels, blocklengths, error probability requirements, and decoders that one encounters in practical systems.
Abstract: The "threshold" of a code ensemble can be defined as the noise level at which the block error probability curve crosses 1/2. For ensembles of low-density parity-check codes used over the binary erasure channel, the behavior of the threshold for large blocklengths is known in detail. It is characterized by an asymptotic threshold value and a finite-blocklength shift parameter.
Abstract: In this paper we investigate the behavior of iteratively decoded low-density parity-check codes over the binary erasure channel in the so-called "waterfall region." We show that the performance curves in this region follow a very basic scaling law. We conjecture that essentially the same scaling behavior applies in a much more general setting and we provide some empirical evidence to support this conjecture.
Abstract: We use a generalization of the Lindeberg principle developed by S. Chatterjee to prove universality properties for various problems in communications, statistical learning and random matrix theory. We also show that these systems can be viewed as the limiting case of a properly defined sparse system. The latter result is useful when the sparse systems are easier to analyze than their dense counterparts. The list of problems we consider is by no means exhaustive.
Abstract: We present a simple and efficient algorithm for randomly generating simple graphs without small cycles. These graphs can be used to design high performance Low-Density Parity-Check (LDPC) codes. For any constant k, α ≤ 1/(2k(k+3)) and m = O(n^{1+α}), our algorithm generates an asymptotically uniform random graph with n vertices, m edges, and girth larger than k in polynomial time. To the best of our knowledge this is the first polynomial algorithm for the problem.
Abstract: We consider linear models for stochastic dynamics. To any such model can be associated a network (namely, a directed graph) describing which degrees of freedom interact under the dynamics. We tackle the problem of learning such a network from observation of the system trajectory over a time interval $T$.
Abstract: Consider the problem of learning the drift coefficient of a stochastic differential equation from a sample path. In this paper, we assume that the drift is parametrized by a high-dimensional vector. We address the question of how long the system needs to be observed in order to learn this vector of parameters. We prove a general lower bound on this time complexity by using a characterization of mutual information as a time integral of conditional variance, due to Kadota, Zakai, and Ziv.
Abstract: We consider communication over memoryless channels using low-density parity-check code ensembles above the iterative (belief propagation) threshold. What is the computational complexity of decoding (i.e., of reconstructing all the typical input codewords for a given channel output) in this regime? We define an algorithm accomplishing this task and analyze its typical performance. The behavior of the new algorithm can be expressed in purely information-theoretic terms.
Abstract: Let M be an nα × n matrix of rank r, and assume that a uniformly random subset E of its entries is observed. We describe an efficient algorithm, which we call OptSpace, that reconstructs M from |E| = O(rn) observed entries with relative root mean square error RMSE ≤ C(α)(nr/|E|)^{1/2} with probability larger than 1 − 1/n³. Further, if r = O(1) and M is sufficiently unstructured, then OptSpace reconstructs it exactly from |E| = O(n log n) entries with probability larger than 1 − 1/n³.
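The matrix completion task described in this abstract can be illustrated with a naive SVD-imputation iteration: alternately fill in the unobserved entries and project onto rank-r matrices. This is a simplified stand-in for the spectral ideas behind such methods, not OptSpace itself (which additionally trims over-represented rows/columns and optimizes over a Grassmann manifold); the dimensions and sampling rate below are illustrative.

```python
import numpy as np

def complete_low_rank(M, mask, r, n_iter=200):
    """Naive matrix completion: alternate between a rank-r SVD projection
    and resetting the observed entries to their known values.
    A simplified illustration, not the OptSpace algorithm."""
    X = M * mask                                        # start from the zero-filled matrix
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X_low = (U[:, :r] * s[:r]) @ Vt[:r]             # best rank-r approximation
        X = mask * M + (1 - mask) * X_low               # keep observed entries fixed
    return X_low

rng = np.random.default_rng(1)
n, r = 40, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # random rank-2 matrix
mask = (rng.random((n, n)) < 0.5).astype(float)                 # observe ~50% of entries
M_hat = complete_low_rank(M, mask, r)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```

With far more observed entries than the ~2nr degrees of freedom of a rank-2 matrix, this iteration drives the relative error close to zero.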
Abstract: Measuring network flow sizes is important for tasks like accounting/billing, network forensics and security. Per-flow accounting is considered hard because it requires that many counters be updated at a very high speed; however, the large fast memories needed for storing the counters are prohibitively expensive. Therefore, current approaches aim to obtain approximate flow counts; that is, to detect large elephant flows and then measure their sizes.