Learning discrete Bayesian networks with hierarchical Dirichlet shrinkage

arXiv:2509.13267v2 Announce Type: replace-cross Abstract: A discrete Bayesian network (DBN) is a directed acyclic graph (DAG) whose nodes are categorical variables. Two popular approaches to DBN modeling are classification and nonparametric methods; however, both often require a large number of parameters: high-order interactions in the former and cell probabilities in the latter. In this article, we propose a hierarchical model for the node-parent conditional probabilities that induces shrinkage toward low-dimensional latent parameters a posteriori. We generate samples from the posterior distribution of these latent variables using the Metropolis-adjusted Langevin algorithm (MALA) within a Gibbs sampler. Moreover, we verify that the full conditional distribution is log-concave under mild conditions, which facilitates efficient sampling. We then detail several structure-learning algorithms that incorporate our hierarchical prior while preserving the DAG property. Through simulations, we evaluate the method's performance for sparse counts, recovering graph structure, and selecting between competing DAGs. We conclude with an application to uncovering prognostic network structure from a breast cancer dataset.
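
The sampling step described in the abstract, MALA within a Gibbs sweep targeting a log-concave full conditional, can be illustrated with a short sketch. The target below (a softmax-multinomial likelihood with a Gaussian prior on latent logits, which is log-concave) is an illustrative stand-in, not the paper's exact hierarchical Dirichlet model; all function names and the step size are hypothetical.

```python
# Minimal MALA-within-Gibbs sketch for one latent block, assuming a generic
# log-concave full conditional. Not the paper's model: an illustrative target.
import numpy as np

rng = np.random.default_rng(0)

def log_post(theta, counts, tau=1.0):
    """Multinomial log-likelihood of `counts` under softmax(theta),
    plus an isotropic Gaussian prior with scale tau. Log-concave in theta."""
    log_softmax = theta - np.logaddexp.reduce(theta)
    return counts @ log_softmax - 0.5 * (theta @ theta) / tau**2

def grad_log_post(theta, counts, tau=1.0):
    """Gradient: counts - n * softmax(theta) - theta / tau^2."""
    p = np.exp(theta - np.logaddexp.reduce(theta))
    return counts - counts.sum() * p - theta / tau**2

def mala_step(theta, counts, step=0.05):
    """One Langevin proposal plus Metropolis accept/reject."""
    g = grad_log_post(theta, counts)
    prop = theta + 0.5 * step * g + np.sqrt(step) * rng.standard_normal(theta.size)
    g_prop = grad_log_post(prop, counts)
    # Correction for the asymmetric Gaussian proposal q(. | .)
    fwd = -np.sum((prop - theta - 0.5 * step * g) ** 2) / (2 * step)
    rev = -np.sum((theta - prop - 0.5 * step * g_prop) ** 2) / (2 * step)
    log_alpha = log_post(prop, counts) - log_post(theta, counts) + rev - fwd
    return prop if np.log(rng.uniform()) < log_alpha else theta

counts = np.array([17.0, 3.0, 0.0, 1.0])  # sparse cell counts, one parent config
theta = np.zeros(4)
for _ in range(2000):                     # inner MALA sweep within a Gibbs cycle
    theta = mala_step(theta, counts)
```

Log-concavity of the target is what makes a gradient-driven proposal with a fixed step size behave well here: the drift always points toward the single mode, so the chain mixes without step-size tuning per region.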
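The structure-learning algorithms must also preserve acyclicity when proposing edge changes. Below is a hedged sketch of the standard reachability check used to veto cycle-creating edge additions; the graph representation and function name are assumptions for illustration, not the paper's implementation.

```python
# Sketch of a DAG-preservation check for edge-addition moves: adding u -> v is
# admissible only if no directed path v ~> u already exists in the graph.
def creates_cycle(adj, u, v):
    """Return True if adding edge u -> v to the DAG `adj`
    (dict: node -> set of children) would create a directed cycle."""
    stack, seen = [v], set()
    while stack:
        node = stack.pop()
        if node == u:          # v reaches u, so u -> v would close a cycle
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(adj.get(node, ()))
    return False

dag = {"A": {"B"}, "B": {"C"}, "C": set()}
assert creates_cycle(dag, "C", "A")      # C -> A closes A -> B -> C -> A
assert not creates_cycle(dag, "A", "C")  # A -> C is a valid shortcut edge
```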
