Hello! I am a final-year Statistics Ph.D. student at the Wharton School, University of Pennsylvania, gratefully advised by Edgar Dobriban. I am broadly interested in understanding large models and AI safety. Currently, my research focuses on LLMs, diffusion models, and adversarial attacks.
I completed my undergraduate education at UC Berkeley, graduating with honors with Bachelor's degrees in Computer Science, Mathematics, and Statistics. I was very fortunate to have been advised by Will Fithian and Horia Mania.
Previously, I interned at Amazon AI, where I worked on diffusion models for causality, and at Jane Street Capital as a trading intern.
We propose PAIR (Prompt Automatic Iterative Refinement), an automated method that uses an attacker language model to systematically generate semantic jailbreaks for a target language model, often in fewer than twenty queries. PAIR is orders of magnitude more computationally efficient than state-of-the-art methods and requires only black-box access.
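At a high level, PAIR runs an attacker model in a loop: it proposes a candidate jailbreak prompt, sends it to the black-box target, has a judge score the response, and feeds that feedback into the next refinement. The sketch below illustrates this loop under stated assumptions; the callables query_attacker, query_target, and judge_score, as well as the 1-10 scoring scale, are hypothetical stand-ins, not the paper's actual interface.

```python
from typing import Callable, Optional

def pair_attack(
    objective: str,
    query_attacker: Callable[[str, list], str],  # attacker LLM: (objective, history) -> candidate prompt
    query_target: Callable[[str], str],          # black-box target LLM: prompt -> response
    judge_score: Callable[[str, str], int],      # judge LLM: (objective, response) -> score, assumed 1-10
    max_queries: int = 20,
) -> Optional[str]:
    """Iteratively refine a jailbreak prompt against a black-box target.

    A minimal sketch of the PAIR loop; all three callables are assumed
    wrappers around LLM API calls, not part of the released code.
    """
    history: list = []  # (prompt, response, score) feedback for the attacker
    for _ in range(max_queries):
        prompt = query_attacker(objective, history)  # propose / refine a candidate
        response = query_target(prompt)              # single black-box query
        score = judge_score(objective, response)     # rate jailbreak success
        if score >= 10:                              # judge deems the attack successful
            return prompt
        history.append((prompt, response, score))    # refine using this feedback
    return None  # query budget exhausted without a successful jailbreak
```

Since each iteration makes one call to the target, a successful run costs at most max_queries black-box queries, which is where the "fewer than twenty queries" figure comes from.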
When estimating an average under distribution shift, folk wisdom says to use a robust estimator such as the median. Surprisingly, we show that under bounded (Wasserstein) shifts the sample mean remains minimax optimal, and for linear regression, ordinary least squares remains optimal.
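Concretely, the claim can be read as a minimax statement; the display below is a schematic formulation in my own notation (Wasserstein radius ε, squared-error risk), not necessarily the paper's exact setup:

```latex
% Schematic minimax formulation; the notation (\varepsilon, squared error)
% is assumed for illustration, not taken from the paper.
% P is the clean distribution with mean \mu(P); the adversary may shift
% the data distribution to any Q within Wasserstein distance \varepsilon:
\[
  \inf_{\hat{\theta}} \;
  \sup_{Q \,:\, W(P,\,Q) \le \varepsilon} \;
  \mathbb{E}_{X_1,\dots,X_n \sim Q}
  \bigl\| \hat{\theta}(X_1,\dots,X_n) - \mu(P) \bigr\|^2 .
\]
% The result says the sample mean \bar{X}_n attains this minimax risk,
% and ordinary least squares plays the same role for linear regression.
```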