### Conference paper

Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2021

Aaron Schein, Anjali Nagulpally, Hanna Wallach, Patrick Flaherty

### Cite

**APA**

Schein, A., Nagulpally, A., Wallach, H., & Flaherty, P. (2021). Doubly Non-Central Beta Matrix Factorization for DNA Methylation Data. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI).

**Chicago/Turabian**

Schein, Aaron, Anjali Nagulpally, Hanna Wallach, and Patrick Flaherty. “Doubly Non-Central Beta Matrix Factorization for DNA Methylation Data.” In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2021.

**MLA**

Schein, Aaron, et al. “Doubly Non-Central Beta Matrix Factorization for DNA Methylation Data.” Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2021.

Other materials: [Code]

Abstract: We present a new non-negative matrix factorization model for (0, 1) bounded-support data based on the doubly non-central beta (DNCB) distribution, a generalization of the beta distribution. The expressiveness of the DNCB distribution is particularly useful for modeling DNA methylation datasets, which are typically highly dispersed and multi-modal; however, the model structure is sufficiently general that it can be adapted to many other domains where latent representations of (0, 1) bounded-support data are of interest. Although the DNCB distribution lacks a closed-form conjugate prior, several augmentations let us derive an efficient posterior inference algorithm composed entirely of analytic updates. Our model improves out-of-sample predictive performance on both real and synthetic DNA methylation datasets over state-of-the-art methods in bioinformatics. In addition, our model yields meaningful latent representations that accord with existing biological knowledge.
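
To make the phrase "a generalization of the beta distribution" concrete, here is a minimal sketch of drawing from a DNCB distribution through its Poisson-mixture representation: two Poisson counts are added to the two beta shape parameters before a standard beta draw. The parameterization (Poisson rates equal to the non-centrality parameters), the function names `sample_poisson` and `sample_dncb`, and the parameter values are illustrative assumptions, not taken from the paper or its code. The sketch does not implement the paper's matrix factorization or its inference algorithm.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_dncb(eps1, eps2, lam1, lam2, rng):
    """Draw one DNCB variate as a Poisson mixture of betas.

    Assumed parameterization: eps1, eps2 > 0 are the shape parameters and
    lam1, lam2 >= 0 are the non-centrality parameters, used directly as
    Poisson rates.
    """
    k1 = sample_poisson(lam1, rng)
    k2 = sample_poisson(lam2, rng)
    return rng.betavariate(eps1 + k1, eps2 + k2)


rng = random.Random(0)

# Non-central case: the larger lam1 pushes mass toward 1.
draws = [sample_dncb(1.0, 1.0, 3.0, 1.0, rng) for _ in range(10_000)]

# With both non-centralities at zero the draw reduces to an ordinary Beta(2, 3).
central = [sample_dncb(2.0, 3.0, 0.0, 0.0, rng) for _ in range(10_000)]
```

Setting both non-centrality parameters to zero recovers the ordinary beta distribution, which is the sense in which the DNCB generalizes it under this representation.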