Hi, I am a Senior Researcher at Microsoft Research AI Frontiers.
I received my PhD from the MIT Department of Electrical Engineering and Computer Science (EECS) in Summer 2024, where I was advised by Professors Suvrit Sra and Ali Jadbabaie.
Some recent highlights:
- Preprint: Adam with model exponential moving average is effective for nonconvex optimization
- ICLR 2024: Linear attention is (maybe) all you need (to understand transformer optimization)
- ICML 2024: Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
- NeurIPS 2023: Transformers learn to implement preconditioned gradient descent for in-context learning
- NeurIPS 2023: Learning threshold neurons via edge-of-stability
- NeurIPS 2023 OTML Workshop: SpecTr++: Improved transport plans for speculative decoding of large language models
Work / Visiting Positions (during PhD):
- Google Research, New York (Summer 2021)
PhD Research Intern, Learning Theory Team
Mentors: Prateek Jain, Satyen Kale, Praneeth Netrapalli, Gil Shamir
- Google Research, New York (Summer & Fall 2023)
PhD Research Intern, Speech & Language Algorithms Team
Mentors: Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami
- Simons Institute, "Geometric Methods in Optimization and Sampling", Berkeley, CA (Fall 2021)
Visiting Graduate Student
Master's Thesis:
- From Proximal Point Method to Accelerated Methods on Riemannian Manifolds