Publications

Google Scholar Profile

Preprints:

  1. Does SGD really happen in tiny subspaces?
    Minhak Song, Kwangjun Ahn, Chulhee Yun
    May 2024.
  2. Graph Matrices: Norm Bounds and Applications
    Kwangjun Ahn, Dhruv Medarametla, and Aaron Potechin 
    Oct. 2020.
    [Talk Video 2020]

Published Works:

  1. Adam with model exponential moving average is effective for nonconvex optimization
    Kwangjun Ahn, Ashok Cutkosky
    Advances in Neural Information Processing Systems (NeurIPS 2024)
  2. Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise 
    Kwangjun Ahn, Zhiyu Zhang, Yunbum Kook, Yan Dai
    International Conference on Machine Learning (ICML 2024)
  3. How to escape sharp minima with random perturbations
    Kwangjun Ahn, Ali Jadbabaie, Suvrit Sra 
    International Conference on Machine Learning (ICML 2024)
  4. Linear attention is (maybe) all you need (to understand transformer optimization)
    Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra
    International Conference on Learning Representations (ICLR 2024)
    [Talk Video at MIT]
    (Also presented at NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning; selected for Oral Presentation)
  5. Transformers learn to implement preconditioned gradient descent for in-context learning
    Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra 
    Advances in Neural Information Processing Systems (NeurIPS 2023)
  6. The Crucial Role of Normalization in Sharpness-Aware Minimization
    Yan Dai, Kwangjun Ahn, Suvrit Sra
    Advances in Neural Information Processing Systems (NeurIPS 2023)
    [Talk Video at INFORMS]
  7. Learning threshold neurons via edge-of-stability
    Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Yin Tat Lee, Felipe Suarez, Yi Zhang
    Advances in Neural Information Processing Systems (NeurIPS 2023)
    [Talk Video at Microsoft] [Talk Video at INFORMS]
  8. SpecTr++: Improved transport plans for speculative decoding of large language models
    Kwangjun Ahn, Ahmad Beirami, Ziteng Sun, Ananda Theertha Suresh
    NeurIPS 2023 Workshop on Optimal Transport and Machine Learning, Dec. 2023.
  9. Model Predictive Control via On-Policy Imitation Learning
    Kwangjun Ahn, Zakaria Mhammedi, Horia Mania, Zhang-Wei Hong, Ali Jadbabaie
    Annual Learning for Dynamics & Control Conference (L4DC 2023)
    (Selected for Oral Presentation)
  10. Mirror Descent Maximizes Generalized Margin and Can Be Implemented Efficiently
    Haoyuan Sun, Kwangjun Ahn, Christos Thrampoulidis, Navid Azizan
    Advances in Neural Information Processing Systems (NeurIPS 2022)
  11. Reproducibility in Optimization: Theoretical Framework and Limits
    Kwangjun Ahn, Prateek Jain, Ziwei Ji, Satyen Kale, Praneeth Netrapalli, Gil I. Shamir
    Advances in Neural Information Processing Systems (NeurIPS 2022)
    (Selected for Oral Presentation)
  12. One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares
    Youngjae Min, Kwangjun Ahn, Navid Azizan
    Conference on Decision and Control (CDC 2022)  
  13. Understanding the unstable convergence of gradient descent
    Kwangjun Ahn, Jingzhao Zhang, Suvrit Sra
    International Conference on Machine Learning (ICML 2022)
  14. Agnostic Learnability of Halfspaces via Logistic Loss
    Ziwei Ji, Kwangjun Ahn, Pranjal Awasthi, Satyen Kale, Stefani Karp
    International Conference on Machine Learning (ICML 2022)
  15. Understanding Nesterov's Acceleration via Proximal Point Method
    Kwangjun Ahn and Suvrit Sra
    SIAM Symposium on Simplicity in Algorithms (SOSA 2022)
  16. Efficient constrained sampling via the mirror-Langevin algorithm
    Kwangjun Ahn and Sinho Chewi
    Advances in Neural Information Processing Systems (NeurIPS 2021)
  17. Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm
    Sinho Chewi, Chen Lu, Kwangjun Ahn, Xiang Cheng, Thibaut Le Gouic, Philippe Rigollet
    Annual Conference on Learning Theory (COLT 2021) 
  18. SGD with shuffling: optimal rates without component convexity and large epoch requirements
    Kwangjun Ahn, Chulhee Yun, and Suvrit Sra 
    Advances in Neural Information Processing Systems (NeurIPS 2020)
    (Selected for Spotlight Presentation)
  19. A Simpler Strong Refutation of Random k-XOR
    Kwangjun Ahn
    International Conference on Randomization and Computation (RANDOM 2020)
  20. From Nesterov's Estimate Sequence to Riemannian Acceleration
    Kwangjun Ahn and Suvrit Sra
    Annual Conference on Learning Theory (COLT 2020)
  21. Community Recovery in Hypergraphs
    Kwangjun Ahn, Kangwook Lee, and Changho Suh
    IEEE Transactions on Information Theory 2019
  22. Binary Rating Estimation with Graph Side Information  
    Kwangjun Ahn, Kangwook Lee, Hyunseung Cha, and Changho Suh
    Advances in Neural Information Processing Systems (NeurIPS 2018)
  23. Hypergraph Spectral Clustering in the Weighted Stochastic Block Model
    Kwangjun Ahn, Kangwook Lee, and Changho Suh
    IEEE Journal of Selected Topics in Signal Processing 2018
  24. Information-theoretic Limits of Subspace Clustering 
    Kwangjun Ahn, Kangwook Lee, and Changho Suh
    IEEE International Symposium on Information Theory (ISIT 2017)

Technical Reports:

  1. Computing the Maximum Matching Width is NP-hard
    Kwangjun Ahn and Jisu Jeong
    Sep. 2017
  2. Riemannian Perspective on Matrix Factorization
    Kwangjun Ahn and Felipe Suarez
    Feb. 2021