Preprints:
- Does SGD really happen in tiny subspaces?
Minhak Song, Kwangjun Ahn, Chulhee Yun
May. 2024.
- Graph Matrices: Norm Bounds and Applications
Kwangjun Ahn, Dhruv Medarametla, and Aaron Potechin
Oct. 2020.
[Talk Video 2020]
Published Works:
- Adam with model exponential moving average is effective for nonconvex optimization
Kwangjun Ahn, Ashok Cutkosky
Advances in Neural Information Processing Systems (NeurIPS 2024)
- Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Kwangjun Ahn, Zhiyu Zhang, Yunbum Kook, Yan Dai
International Conference on Machine Learning (ICML 2024)
- How to escape sharp minima with random perturbations
Kwangjun Ahn, Ali Jadbabaie, Suvrit Sra
International Conference on Machine Learning (ICML 2024)
- Linear attention is (maybe) all you need (to understand transformer optimization)
Kwangjun Ahn, Xiang Cheng, Minhak Song, Chulhee Yun, Ali Jadbabaie, Suvrit Sra
International Conference on Learning Representations (ICLR 2024)
[Talk Video at MIT]
(Also presented at NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning; selected for Oral Presentation)
- Transformers learn to implement preconditioned gradient descent for in-context learning
Kwangjun Ahn, Xiang Cheng, Hadi Daneshmand, Suvrit Sra
Advances in Neural Information Processing Systems (NeurIPS 2023)
- The Crucial Role of Normalization in Sharpness-Aware Minimization
Yan Dai, Kwangjun Ahn, Suvrit Sra
Advances in Neural Information Processing Systems (NeurIPS 2023)
[Talk Video at INFORMS]
- Learning threshold neurons via edge-of-stability
Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Yin Tat Lee, Felipe Suarez, Yi Zhang
Advances in Neural Information Processing Systems (NeurIPS 2023)
[Talk Video at Microsoft] [Talk Video at INFORMS]
- SpecTr++: Improved transport plans for speculative decoding of large language models
Kwangjun Ahn, Ahmad Beirami, Ziteng Sun, Ananda Theertha Suresh
NeurIPS 2023 Workshop on Optimal Transport and Machine Learning, Dec. 2023.
- Model Predictive Control via On-Policy Imitation Learning
Kwangjun Ahn, Zakaria Mhammedi, Horia Mania, Zhang-Wei Hong, Ali Jadbabaie
Annual Learning for Dynamics & Control Conference (L4DC 2023)
(Selected for Oral Presentation)
- Mirror Descent Maximizes Generalized Margin and Can Be Implemented Efficiently
Haoyuan Sun, Kwangjun Ahn, Christos Thrampoulidis, Navid Azizan
Advances in Neural Information Processing Systems (NeurIPS 2022)
- Reproducibility in Optimization: Theoretical Framework and Limits
Kwangjun Ahn, Prateek Jain, Ziwei Ji, Satyen Kale, Praneeth Netrapalli, Gil I. Shamir
Advances in Neural Information Processing Systems (NeurIPS 2022)
(Selected for Oral Presentation)
- One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares
Youngjae Min, Kwangjun Ahn, Navid Azizan
Conference on Decision and Control (CDC 2022)
- Understanding the unstable convergence of gradient descent
Kwangjun Ahn, Jingzhao Zhang, Suvrit Sra
International Conference on Machine Learning (ICML 2022)
- Agnostic Learnability of Halfspaces via Logistic Loss
Ziwei Ji, Kwangjun Ahn, Pranjal Awasthi, Satyen Kale, Stefani Karp
International Conference on Machine Learning (ICML 2022)
- Understanding Nesterov's Acceleration via Proximal Point Method
Kwangjun Ahn and Suvrit Sra
SIAM Symposium on Simplicity in Algorithms (SOSA 2022)
- Efficient constrained sampling via the mirror-Langevin algorithm
Kwangjun Ahn and Sinho Chewi
Advances in Neural Information Processing Systems (NeurIPS 2021)
- Optimal dimension dependence of the Metropolis-Adjusted Langevin Algorithm
Sinho Chewi, Chen Lu, Kwangjun Ahn, Xiang Cheng, Thibaut Le Gouic, Philippe Rigollet
Annual Conference on Learning Theory (COLT 2021)
- SGD with shuffling: optimal rates without component convexity and large epoch requirements
Kwangjun Ahn, Chulhee Yun, and Suvrit Sra
Advances in Neural Information Processing Systems (NeurIPS 2020)
(Selected for Spotlight Presentation)
- A Simpler Strong Refutation of Random k-XOR
Kwangjun Ahn
International Conference on Randomization and Computation (RANDOM 2020)
- From Nesterov's Estimate Sequence to Riemannian Acceleration
Kwangjun Ahn and Suvrit Sra
Annual Conference on Learning Theory (COLT 2020)
- Community Recovery in Hypergraphs
Kwangjun Ahn, Kangwook Lee, and Changho Suh
IEEE Transactions on Information Theory 2019
- Binary Rating Estimation with Graph Side Information
Kwangjun Ahn, Kangwook Lee, Hyunseung Cha, and Changho Suh
Advances in Neural Information Processing Systems (NeurIPS 2018)
- Hypergraph Spectral Clustering in the Weighted Stochastic Block Model
Kwangjun Ahn, Kangwook Lee, and Changho Suh
IEEE Journal of Selected Topics in Signal Processing 2018.
- Information-theoretic Limits of Subspace Clustering
Kwangjun Ahn, Kangwook Lee, and Changho Suh
IEEE International Symposium on Information Theory (ISIT 2017)
Technical Reports:
- Computing the Maximum Matching Width is NP-hard
Kwangjun Ahn and Jisu Jeong
Sep. 2017.
- Riemannian Perspective on Matrix Factorization
Kwangjun Ahn and Felipe Suarez
Feb. 2021.