Aaron Schein


Postdoctoral Fellow at Columbia University


Curriculum vitae


aaron.schein@columbia.edu

Data Science Institute


Columbia University


New York, NY




About


I am a postdoctoral fellow in the Data Science Institute at Columbia. My research develops statistical models and computational methods to analyze modern large-scale data in political science, sociology, and genetics, among other fields in the social and natural sciences.

At Columbia, I work with David Blei and Donald Green on conducting and analyzing large-scale digital field experiments to assess the causal effects of friend-to-friend organizing on voter turnout in US elections. I also work more generally with David Blei’s lab on core topics in probabilistic machine learning, approximate Bayesian inference, and causal inference; with Donald Green’s lab on other get-out-the-vote field experiments; and (more recently) with Shipra Agrawal, Roxana Geambasu, and Jeannette Wing on algorithmic fairness and differential privacy.

Outside Columbia, I work with Patrick Flaherty’s lab to develop principled statistical models for (epi)genetic data of cancer, and I am also a (part-time) senior research scientist at Ocurate & PredictWise where I develop deep probabilistic models to forecast consumer behavior.

I received my PhD in Computer Science from UMass Amherst in 2019 under the guidance of Hanna Wallach in the Machine Learning for Data Science lab. My dissertation developed a family of Bayesian tensor decomposition models for high-dimensional discrete data of networks and time-series in international relations.

Before that, I received an MA in Linguistics and a BA in Political Science, both from UMass. I also interned at MITRE, Google, and Microsoft.

I am on the 2021–2022 academic job market and am interested in tenure-track research positions at schools with an interdisciplinary culture. If you think I’d be a good fit in your department, please reach out!

Recent news

  • August 2021: I gave a talk at the Politics and Computational Social Science (PaCSS) conference on preliminary results from our field experiment on friend-to-friend organizing in the 2020 general election.
  • July 2021: Our paper on modeling DNA methylation with the doubly non-central beta distribution was published at UAI 2021.
  • June 2021: Our paper on modeling DNA sequencing data with hierarchical gamma processes has (finally!) been published in the Annals of Applied Statistics.
  • Nov 2020: My op-ed on the success of Biden's "virtual ground game" was published in the Financial Times.
  • Nov 2020: I discussed the election with Professors Gregory Wawro and Robert Shapiro on a panel hosted by the Columbia Undergraduate Science Journal. A recording of our conversation is available here.
  • Nov 2020: Our research was featured in a New Yorker article on "vote tripling".
  • Oct 2020: My op-ed about Biden's "gamble" on virtual friend-to-friend voter mobilization tactics appeared in Columbia News.
  • Oct 2020: I was featured in an NBC News article about Biden's ground game.
  • Sept 2020: I am co-organizing a NeurIPS 2020 workshop called "I Can't Believe It's Not Better!" that aims to highlight common failure modes in probabilistic machine learning research. See our call for papers (submission due Oct 14).
  • July 2020: My talk on the causal effect of friend-to-friend texting on voter turnout was awarded Best Oral Presentation at IC2S2 2020. Video available here.

Selected publications


Doubly Non-Central Beta Matrix Factorization for DNA Methylation Data


Aaron Schein, Anjali Nagulpally, Hanna Wallach, Patrick Flaherty


Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2021


Assessing the Effects of Friend-to-Friend Texting on Turnout in the 2018 US Midterm Elections


Aaron Schein, Keyon Vafa, Dhanya Sridhar, Victor Veitch, Jeffrey Quinn, David M. Blei, James Moffet, Donald P. Green


Proceedings of the Web Conference (WWW), 2021


A Bayesian Nonparametric Model for Inferring Subclonal Populations from Structured DNA Sequencing Data


Shai He*, Aaron Schein*, Vishal Sarsani, Patrick Flaherty


Annals of Applied Statistics, vol. 15(2), 2021


Poisson-Randomized Gamma Dynamical Systems


Aaron Schein, Scott Linderman, Mingyuan Zhou, David M. Blei, Hanna M. Wallach


Advances in Neural Information Processing Systems (NeurIPS), 2019

