About

Hi, I am a Senior Researcher at Microsoft Research AI Frontiers. 

I received my PhD from the MIT Department of Electrical Engineering and Computer Science (EECS) in Summer 2024. My advisors were Professors Suvrit Sra and Ali Jadbabaie.

 

Google Scholar Profile

Some recent highlights:
- Preprint: Adam with model exponential moving average is effective for nonconvex optimization
- ICLR 2024: Linear attention is (maybe) all you need (to understand transformer optimization)
- ICML 2024: Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
- NeurIPS 2023: Transformers learn to implement preconditioned gradient descent for in-context learning
- NeurIPS 2023: Learning threshold neurons via edge-of-stability
- NeurIPS OTML 2023: SpecTr++: Improved transport plans for speculative decoding of large language models

Work / Visiting (during PhD):

- Google Research, New York (Summer 2021)
PhD Research Intern, Learning Theory Team
Mentors: Prateek Jain, Satyen Kale, Praneeth Netrapalli, Gil Shamir

- Google Research, New York (Summer & Fall 2023)
PhD Research Intern, Speech & Language Algorithms Team
Mentors: Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami

- Simons Institute, "Geometric Methods in Optimization and Sampling", Berkeley, CA (Fall 2021)
Visiting Graduate Student

 

Master's Thesis:
- From Proximal Point Method to Accelerated Methods on Riemannian Manifolds