I’m a fifth-year PhD candidate at the University of Maryland studying reinforcement learning in open-ended environments. My research centers on developing an empirical understanding of policy gradient algorithms and automatic curriculum learning methods. I was one of the developers of PettingZoo for multiagent environments and have worked on a number of open-source tools for RL, including my curriculum learning library Syllabus. I’ve interned at Amazon Science, Google Research, and Sony AI, working on RL and RLHF. I’m very interested in research directions that use LLMs to augment RL training in complex environments, or that use RL to improve the general capabilities of LLMs. Feel free to reach out if you’d like to discuss ideas or opportunities to collaborate! I’m looking for industry research scientist or postdoc positions starting in August 2025.
PhD in Computer Science, Expected 2025
University of Maryland
BSc in Computer Science, 2020
Purdue University
BSc in Applied Statistics, 2020
Purdue University
BSc in Mathematics, 2020
Purdue University
We introduce MO-ODPO, an efficient and robust algorithm for aligning large language models with multiple conflicting preferences, allowing flexible steerability at inference.
Syllabus provides a portable library and universal API for implementing curriculum learning methods in reinforcement learning across diverse environments and RL libraries.
We propose Conditional Language Policy (CLP), a framework for finetuning steerable language models that effectively balance multiple conflicting objectives without maintaining separate models.
We applied the implementation tricks introduced by DreamerV3 to PPO and identified cases where they help or harm reward robustness.
The paper describes Neural MMO 2.0, a massive multitask update for the multiagent Neural MMO environment.