I study computer science and statistics at Harvard University. I'm interested in understanding why AI systems behave unexpectedly, especially how unintended capabilities and failure modes emerge with interaction and scale. I aim to better understand AI systems in order to develop more reliable methods for aligning them with human intent. I'm also broadly interested in multi-agent systems, drawing from reinforcement learning and evolutionary computation to study emergent behavior in open-ended environments. I love music and language, too.

I'm currently an undergraduate researcher at the Kempner Institute working on problems in AI safety and multi-agent systems with Professor Kianté Brantley and Research Fellow Aaron Walsman. Before that, I worked on multi-agent reasoning with language models under Professor Yilun Du.

I interned as a software engineer in Institutional Securities Technology at Morgan Stanley. Before that, I was a Machine Learning Engineer Intern at FADEL and a Generative AI Research Intern at The Slade Lab.

Recent Posts

What I’ve learned doing RL with JAX

8 minute read

Lessons learned while building mechagogue, a reinforcement learning repository of from-scratch JAX implementations of classic RL algorithms.

Research

Programmatic Representation Learning for Reward Model Debugging

Are Reward Models (RMs) used in RLHF actually rewarding what we want them to? I extend learned programmatic representation models to interpret 'helpful' RMs, distilling opaque internal heuristics into human-readable Python functions. Through SHAP analysis of the learned programmatic features, I identify exploitable, non-semantic biases, including verbosity and list-formatting biases that cause the RM to assign higher rewards to unhelpful responses. Ongoing work: automating the bias-discovery pipeline as a tool for auditing alignment systems for failure modes.
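The audit above can be sketched in miniature. Everything here is illustrative rather than the project's actual method: the features and toy data are hypothetical, and plain least-squares coefficients stand in for SHAP attributions over the learned programmatic features.

```python
# Minimal sketch: surfacing non-semantic reward-model biases via
# human-readable programmatic features. Feature names and data are
# hypothetical; least-squares weights stand in for SHAP values.
import numpy as np

def programmatic_features(response: str) -> list[float]:
    """Extract simple, interpretable features from a response."""
    lines = response.splitlines()
    return [
        float(len(response.split())),  # verbosity: word count
        float(sum(l.lstrip().startswith(("-", "*")) for l in lines)),  # list formatting
    ]

def fit_bias_weights(responses, rewards):
    """Regress rewards on programmatic features; a large weight on a
    non-semantic feature flags a potentially exploitable bias."""
    X = np.array([programmatic_features(r) for r in responses])
    X = np.column_stack([X, np.ones(len(X))])  # add intercept column
    w, *_ = np.linalg.lstsq(X, np.array(rewards), rcond=None)
    return {"verbosity": w[0], "list_format": w[1]}

# Toy data where longer, list-formatted answers happen to score higher
responses = [
    "Short answer.",
    "- point one\n- point two\n- point three",
    "A much longer answer with many many extra words padded on.",
]
rewards = [0.2, 0.8, 0.7]
weights = fit_bias_weights(responses, rewards)
```

On this toy data both weights come out positive, mirroring the verbosity and list-formatting biases described above; in the real pipeline, SHAP attributions play this role over richer learned features.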

When Honest Work Becomes Impossible: Coding Agents Under Pressure

Experiments and talk for Professor Boaz Barak's graduate seminar, Topics in Foundations of ML: AI Alignment and Safety. Demonstrated how impossible tasks and threats to autonomy and capabilities can elicit evaluation hacking by coding agents. Highlighted the challenges of measuring misaligned behaviors, with situational awareness as a growing concern.

The Emergence of Complex Behavior in Large-Scale Ecological Environments

In an effort to discover how complex behaviors naturally emerge, we conduct experiments in large-scale open-ended worlds that reach populations of more than 60,000 individual agents, each with its own evolved neural network policy. We examine how sensing modalities and environmental scale affect the emergence of various behaviors, finding that some appear only in sufficiently large environments and populations, with larger scales increasing behavioral stability and consistency. Our scaling results suggest promising new directions for exploring ecology as an instrument of machine learning in an era of abundant computational resources.

Explain This, Pruner! The Effect of Zero-Order Pruning on LLM Explainability and Curvature

An investigation of the effect of model compression on AI interpretability. Read our paper in The Harvard Undergraduate Research Journal.

Large Motion Diffusion Models

Training and evaluation of diffusion models on the AddBiomechanics dataset for generating sequences of human motion. We presented a lightning talk at the 2025 Harvard Generative AI Symposium.

Engineering

DIRT: The Distributed Intelligent Replicator Toolkit

We introduce DIRT, a GPU-accelerated simulation platform built on JAX for studying large-scale multi-agent populations in simulated ecosystems. DIRT is designed to explore how intelligence in artificial agents shapes the emergent population dynamics of complex environments at very large scales. To support analysis, DIRT includes integrated measurement tools and an interactive 3D viewer for fine-grained agent inspection and tracking.

Mechagogue

'Teacher of Machines,' a JAX-based machine learning framework for reinforcement learning, supervised learning, and evolutionary algorithms.

The Golden Arm

The official web app for Harvard's student-run movie theater, with a custom content management system, seat booking, archives, merch shop, and more.

SlavicGPT

Building, training, and fine-tuning of GPTs on Russian text and Slavic literature scraped from the web.

VioLibrary

A web app for searching violin recital repertoire, discovering new pieces via personalized recommendations, and building recital programs.