Please refer to my Google Scholar profile for recent publications:

Google Scholar Profile



  1. SpecTr++: Improved transport plans for speculative decoding of large language models
    Kwangjun Ahn, Ahmad Beirami, Ziteng Sun, Ananda Theertha Suresh

    NeurIPS 2023 Workshop on Optimal Transport and Machine Learning, Dec. 2023. 
  2. Linear attention is (maybe) all you need (to understand transformer optimization)
    Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra 

    NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning, Dec. 2023. 
    (Selected for Oral Presentation [Talk Video at MIT Seminar])
  3. How to escape sharp minima
    Kwangjun Ahn, Ali Jadbabaie, Suvrit Sra 
    May 2023.
  4. Graph Matrices: Norm Bounds and Applications
    Kwangjun Ahn, Dhruv Medarametla, and Aaron Potechin 
    Oct. 2020.
    [Talk Video at MIT LIDS/STAT Tea Talk 2020]

Published Works:

  1. Transformers learn to implement preconditioned gradient descent for in-context learning
    Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra 

    Neural Information Processing Systems (NeurIPS), Dec. 2023
  2. The Crucial Role of Normalization in Sharpness-Aware Minimization
    Yan Dai, Kwangjun Ahn, Suvrit Sra

    Neural Information Processing Systems (NeurIPS), Dec. 2023
    [Talk Video at INFORMS 2023]
  3. Learning threshold neurons via edge-of-stability
    Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Yin Tat Lee, Felipe Suarez, Yi Zhang 

    Neural Information Processing Systems (NeurIPS), Dec. 2023
    [Talk Video at Microsoft Research] [Talk Video at INFORMS 2023]
  4. Model Predictive Control via On-Policy Imitation Learning
    Kwangjun Ahn, Zakaria Mhammedi, Horia Mania, Zhang-Wei Hong, Ali Jadbabaie 

    5th Annual Learning for Dynamics & Control Conference (L4DC), July, 2023
    (Selected for Oral Presentation [Talk Video])
  5. Mirror Descent Maximizes Generalized Margin and Can Be Implemented Efficiently
    Haoyuan Sun, Kwangjun Ahn, Christos Thrampoulidis, Navid Azizan

    Neural Information Processing Systems (NeurIPS), Dec. 2022
    Journal version Published in JMLR 2023
  6. Reproducibility in Optimization: Theoretical Framework and Limits
    Kwangjun Ahn, Prateek Jain, Ziwei Ji, Satyen Kale, Praneeth Netrapalli, Gil I. Shamir

    Neural Information Processing Systems (NeurIPS), Dec. 2022
    (Selected for Oral Presentation)
  7. One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares
    Youngjae Min, Kwangjun Ahn, Navid Azizan
    IEEE 61st Conference on Decision and Control (CDC), Dec. 2022  
  8. Understanding the unstable convergence of gradient descent
    Kwangjun Ahn, Jingzhao Zhang, Suvrit Sra
    Proceedings of the 39th International Conference on Machine Learning (ICML), Baltimore, Jul. 2022
    [Talk Video at ICML]
  9. Agnostic Learnability of Halfspaces via Logistic Loss
    Ziwei Ji, Kwangjun Ahn, Pranjal Awasthi, Satyen Kale, Stefani Karp

    Proceedings of the 39th International Conference on Machine Learning (ICML), Baltimore, Jul. 2022
  10. Understanding Nesterov's Acceleration via Proximal Point Method
    Kwangjun Ahn and Suvrit Sra
    SIAM Symposium on Simplicity in Algorithms (SOSA), Jan. 2022
  11. Efficient constrained sampling via the mirror-Langevin algorithm
    Kwangjun Ahn and Sinho Chewi
    Advances in Neural Information Processing Systems (NeurIPS), Dec. 2021
    [Talk Video at NeurIPS 2021] [Talk Video by Sinho at Simons Institute]
  12. Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm
    Sinho Chewi, Chen Lu, Kwangjun Ahn, Xiang Cheng, Thibaut Le Gouic, Philippe Rigollet
    34th Annual Conference on Learning Theory (COLT), Boulder, Colorado, Aug. 2021
    [Talk video by Sinho at Simons Institute]
  13. SGD with shuffling: optimal rates without component convexity and large epoch requirements
    Kwangjun Ahn, Chulhee Yun, and Suvrit Sra 
    Advances in Neural Information Processing Systems (NeurIPS), Dec. 2020.
    (Selected for Spotlight Presentation)

    [Talk Video at NeurIPS 2020] [40min Talk Video by Suvrit at OPTML]
  14. A Simpler Strong Refutation of Random k-XOR
    Kwangjun Ahn
    International Conference on Randomization and Computation (RANDOM) 2020, Seattle, Washington, USA, Aug. 2020.
    [Talk Video at RANDOM 2020]
  15. From Nesterov's Estimate Sequence to Riemannian Acceleration
    Kwangjun Ahn and Suvrit Sra
    Annual Conference on Learning Theory (COLT), Graz, Austria, Jul. 2020
    [Talk Video at COLT 2020] [1hr Talk Video by Suvrit]
  16. Community Recovery in Hypergraphs
    Kwangjun Ahn, Kangwook Lee, and Changho Suh

    IEEE Transactions on Information Theory, vol. 65, no. 10, pp. 6561-6579, Oct. 2019.
  17. Binary Rating Estimation with Graph Side Information  
    Kwangjun Ahn, Kangwook Lee, Hyunseung Cha, and Changho Suh
    Advances in Neural Information Processing Systems (NeurIPS), Montreal, Canada, Dec. 2018
  18. Hypergraph Spectral Clustering in the Weighted Stochastic Block Model
    Kwangjun Ahn, Kangwook Lee, and Changho Suh
    IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 10, Oct. 2018. 
  19. Information-theoretic Limits of Subspace Clustering 
    Kwangjun Ahn, Kangwook Lee, and Changho Suh
    IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, Jun. 2017.
  20. Community Recovery in Hypergraphs
    Kwangjun Ahn, Kangwook Lee, and Changho Suh
    The 53rd Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, Sep. 2016

Technical Reports:

  1. Computing the Maximum Matching Width is NP-hard
    Kwangjun Ahn and Jisu Jeong
    Sep. 2017. 
  2. Riemannian Perspective on Matrix Factorization
    Kwangjun Ahn and Felipe Suarez
    Feb. 2021

Presentation Videos:

  1. Presentation on "Optimal Convergence Rate of Hamiltonian Monte Carlo for Strongly Logconcave Distributions"
    [1hr Presentation Given at Simons Institute Reading Group]
    Sep. 2021, Berkeley, CA