Finding Optimal Bayesian Network Structures with Constraints Learned from Data

Xiannian Fan, Brandon Malone, and Changhe Yuan

Abstract

Several recent algorithms for learning Bayesian network structures first calculate potentially optimal parent sets (POPS) for all variables and then use various optimization techniques to find a set of POPS, one for each variable, that constitutes an optimal network structure. This paper makes the observation that there is useful information implicit in the POPS. Specifically, the POPS of a variable constrain its parent candidates. Moreover, the parent candidates of all variables together give a directed cyclic graph, which often decomposes into a set of strongly connected components (SCCs). Each SCC corresponds to a smaller subproblem which can be solved independently of the others. Our results show that solving the constrained subproblems significantly improves the efficiency and scalability of heuristic search-based structure learning algorithms. Further, we show that by considering only the top p POPS of each variable, we quickly find provably very high quality networks for large datasets.

PDF

Bibtex

@INPROCEEDINGS{Fan14tightening,
author = {Xiannian Fan, Brandon Malone, and Changhe Yuan},
title = {Finding Optimal Bayesian Network Structures with Constraints Learned from Data},
booktitle = {In Proceedings of the 30th Annual Conference on Uncertainty in Artificial Intelligence (UAI-14)},
address = {Quebec City, Quebec},
year = {2014}
}