Statistics & Optimization for Trustworthy AI
Our Research
We develop principled and empirically impactful AI/ML methods:
- mathematical foundations of transformers, sequence modeling, and the capabilities of language models
- core optimization and statistical learning theory
- trustworthy language and time-series (foundation) models
- reinforcement learning, control, and LLMs as interactive agents
Announcements
- I am recruiting a postdoctoral scholar starting around January 2025. Please email me a CV to apply.
- Topics of interest: language modeling theory and algorithms, time-series and tabular data, and reasoning
- Also consider applying to MIDAS and Schmidt AI Fellowships: https://midas.umich.edu/training/postdocs/ai-in-science/apply/
- My student Yingcong Li will be on the academic job market!
- We invite expository papers on generative models for the IEEE BITS special issue: tinyurl.com/n2t84sna
Recent news
- Congrats to Mingchen on his graduation and joining Meta as a Research Scientist!
- Four papers accepted to NeurIPS 2024:
- “Selective Attention: Enhancing Transformer through Principled Context Control” (paper coming)
- “Efficient Contextual LLM Cascades through Budget-Constrained Policy Learning”
- “Fine-grained Analysis of In-context Linear Estimation”
- “CONTRAST: Continual Multi-source Adaptation to Dynamic Distributions”
- I will serve as a Senior Area Chair for NeurIPS 2024.
- Congrats to our 2023 interns who will pursue their PhD studies at UC Berkeley, Harvard, and UIUC!
- Two papers at ICML 2024: Self-Attention <=> Markov Models and Can Mamba Learn How to Learn?
- New course on Foundations of Large Language Models: syllabus (including Piazza and logistics)
- New awards from NSF and ONR: We kickstarted two exciting projects to advance the theoretical and algorithmic foundations of LLMs, transformers, and their compositional learning capabilities.
- Two papers at AISTATS 2024
- “Mechanics of Next Token Prediction with Self-Attention”, Y. Li, Y. Huang, M.E. Ildiz, A.S. Rawat, S.O.
- “Inverse Scaling and Emergence in Multitask Representations”, M.E. Ildiz, Z. Zhao, S.O.
- Two papers at AAAI 2024 and one paper at WACV 2024
- Invited talks at USC, INFORMS, Yale, Google NYC, and Harvard on our work on transformer theory
- Two papers at NeurIPS 2023
- Grateful for the Adobe Data Science Research award!
- Our new work develops the optimization foundations of Transformers through a connection to SVMs
- Two papers at ICML 2023: Transformers as Algorithms and On the Role of Attention in Prompt-tuning
- Two papers at AAAI 2023: Provable Pathways and Long Horizon Bandits