Scaling Up Bayesian Network Learning

Xiannian Fan

Abstract

Bayesian networks are widely used graphical models which represent uncertain relations between the random variables in a domain compactly and intuitively. The first step of applying Bayesian networks to real-word problems is typically building the network structure. Optimal structure learning via score-and-search has become an active research topic in recent years. In this context, a scoring function is used to measure the goodness of fit of a structure to given data, and the goal is to find the structure which optimizes the scoring function. The problem has been viewed as a shortest path problem, and has been shown to be NP-hard. The complexity of the structure learning limits the usage of Bayesian networks. Thus, we propose to leverage and model correlations among variables to improve the efficiency of finding optimal structures of Bayesian networks.

In particular, the shortest path problem highlights the importance of two research issues: the quality of heuristic functions for guiding the search, and the complexity of search space. This thesis introduces several techniques for addressing the issues. We present e ective approaches to reducing the search space by extracting constraints directly from data. We also propose various methods to improve heuristic functions, so as to search over the most promising part of the solution space. Empirical results show that these methods significantly improve the eciency and scalability of heuristics search-based structure learning.

Bibtex

@phdthesis{Fan16scaling,
author = {Xiannian Fan},
title = {Scaling Up Bayesian Network Learning},
school = {CUNY Graduate Center},
address = {Computer Science Program},
year = {2016}
}