About

Hi, I am a PhD student in the Department of EECS (Electrical Engineering & Computer Science) at MIT. I started my PhD in Fall 2019. My advisors are Professors Suvrit Sra and Ali Jadbabaie.

Google Scholar Profile

Recently, I have been focusing on understanding and designing optimization algorithms. Here are some of my recent works on understanding Adam:
- ICLR 2024: Linear attention is (maybe) all you need (to understand transformer optimization)
- Preprint: Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise

Check out my papers on understanding in-context learning and accelerating LLMs via improved speculative decoding.
- NeurIPS 2023: Transformers learn to implement preconditioned gradient descent for in-context learning
- NeurIPS OTML 2023: SpecTr++: Improved transport plans for speculative decoding of large language models

Also, check out my works on understanding optimization dynamics in deep learning:
- NeurIPS 2023: Learning threshold neurons via edge-of-stability
- NeurIPS 2023: The Crucial Role of Normalization in Sharpness-Aware Minimization

 

Work / Visiting (during PhD):

- Google Research, New York (Summer 2021)
PhD Research Intern on the Learning Theory Team
Mentors: Prateek Jain, Satyen Kale, Praneeth Netrapalli, Gil Shamir

- Google Research, New York (Summer & Fall 2023)
PhD Research Intern on the Speech & Language Algorithms Team
Mentors: Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami

- Simons Institute, "Geometric Methods in Optimization and Sampling", Berkeley CA (Fall 2021)
Visiting Graduate Student

 

Master's Thesis:
 - From Proximal Point Method to Accelerated Methods on Riemannian Manifolds