Identifying Mixtures of Bayesian Network Distributions

by   Spencer L. Gordon, et al.

A Bayesian Network is a directed acyclic graph (DAG) on a set of n random variables (identified with the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the rv's that is Markovian on the graph. A finite mixture of such models is the projection on these variables of a BND on the larger graph which has an additional "hidden" (or "latent") random variable U, ranging in {1,…,k}, and a directed edge from U to every other vertex. Models of this type are fundamental to research in Causal Inference, where U models a confounding effect. One extremely special case has been of longstanding interest in the theory literature: the empty graph. Such a distribution is simply a mixture of k product distributions. A longstanding problem has been, given the joint distribution of a mixture of k product distributions, to identify each of the product distributions, and their mixture weights. Our results are: (1) We improve the sample complexity (and runtime) for identifying mixtures of k product distributions from exp(O(k^2)) to exp(O(k log k)). This is almost best possible in view of a known exp(Ω(k)) lower bound. (2) We give the first algorithm for the case of non-empty graphs. The complexity for a graph of maximum degree Δ is exp(O(k(Δ^2 + log k))). (The above complexities are approximate and suppress dependence on secondary parameters.)


page 1

page 2

page 3

page 4


Identifying Finite Mixtures of Nonparametric Product Distributions and Causal Inference of Confounders

We propose a kernel method to identify finite mixtures of nonparametric ...

Causal Structure Discovery from Distributions Arising from Mixtures of DAGs

We consider distributions arising from a mixture of causal models, where...

Source Identification for Mixtures of Product Distributions

We give an algorithm for source identification of a mixture of k product...

Dependency in DAG models with Hidden Variables

Directed acyclic graph models with hidden variables have been much studi...

Hadamard Powers and the Identification of Mixtures of Products

The Hadamard Power of a matrix is the matrix consisting of all Hadamard ...

On the Parameterized Complexity of Polytree Learning

A Bayesian network is a directed acyclic graph that represents statistic...

Please sign up or login with your details

Forgot password? Click here to reset