Postdoctoral Fellow at Columbia University
I am a postdoctoral fellow in the Data Science Institute at Columbia. My research develops statistical models and computational methods to analyze modern large-scale data in political science, sociology, and genetics, among other fields in the social and natural sciences.
At Columbia, I work with David Blei and Donald Green on conducting and analyzing large-scale digital field experiments to assess the causal effects of friend-to-friend organizing on voter turnout in US elections. I also work more generally with David Blei’s lab on core topics in probabilistic machine learning, approximate Bayesian inference, and causal inference, with Donald Green's lab on other get-out-the-vote field experiments, and (more recently) with Shipra Agrawal, Roxana Geambasu, and Jeannette Wing on algorithmic fairness and differential privacy.
Outside Columbia, I work with Patrick Flaherty’s lab to develop principled statistical models of (epi)genetic data for cancer research. I am also a (part-time) senior research scientist at Ocurate & PredictWise, where I develop deep probabilistic models to forecast consumer behavior.
I received my PhD in Computer Science from UMass Amherst in 2019 under the guidance of Hanna Wallach in the Machine Learning for Data Science lab. My dissertation developed a family of Bayesian tensor decomposition models for high-dimensional discrete network and time-series data in international relations.
Prior to that, I received an MA in Linguistics and a BA in Political Science, both also from UMass. I also interned in industry at MITRE, Google, and Microsoft.
I am on the 2021–2022 academic job market and am interested in tenure-track research positions at schools with an interdisciplinary culture. If you think I’d be a good fit in your department, please reach out!
- Nov 2021: I will be speaking at the Columbia Marketing Department Quant seminar about our work on digital field experiments on Outvote.
- Oct 2021: Our extended abstract on results from our study in the 2020 Presidential election was accepted for a talk at the Conference on Digital Experimentation (CODE) in November!
- Aug 2021: I gave a talk at the Politics and Computational Social Science (PaCSS) conference on preliminary results from our field experiment on friend-to-friend organizing in the 2020 general election.
- July 2021: Our paper on modeling DNA methylation with the doubly non-central beta distribution was published at UAI 2021.
- June 2021: Our paper on modeling DNA sequencing data with hierarchical gamma processes has (finally!) been published in the Annals of Applied Statistics.
- Nov 2020: My op-ed on the success of Biden's "virtual ground game" was published in the Financial Times.
- Nov 2020: I discussed the election with Professors Gregory Wawro and Robert Shapiro on a panel hosted by the Columbia Undergraduate Science Journal. A recording of our conversation is available here.
- Nov 2020: Our research was featured in a New Yorker article on "vote tripling".
- Oct 2020: My op-ed about Biden's "gamble" on virtual friend-to-friend voter mobilization tactics appeared in Columbia News.
- Oct 2020: I was featured in an NBC News article about Biden's ground game.
- Sept 2020: I am co-organizing a NeurIPS 2020 workshop called "I Can't Believe It's Not Better!" that aims to highlight common failure modes in probabilistic machine learning research. See our call for papers (submission due Oct 14).
- July 2020: My talk on the causal effect of friend-to-friend texting on voter turnout was awarded Best Oral Presentation at IC2S2 2020. Video available here.
Aaron Schein, Anjali Nagulpally, Hanna Wallach, Patrick Flaherty
Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Aaron Schein, David M. Blei, Donald P. Green
Conference on Digital Experimentation (CODE), 2021
Aaron Schein, Keyon Vafa, Dhanya Sridhar, Victor Veitch, Jeffrey Quinn, David M. Blei, James Moffet, Donald P. Green
Proceedings of the Web Conference (WWW), 2021
Shai He*, Aaron Schein*, Vishal Sarsani, Patrick Flaherty
Annals of Applied Statistics, vol. 15(2), 2021