Anas Barakat

Research Fellow

Singapore University of Technology and Design

I am currently a Research Fellow at Singapore University of Technology and Design, working with Georgios Piliouras and Antonios Varvitsiotis.

My primary research interests are in reinforcement learning and multi-agent learning. I develop theory and algorithms for sequential decision-making and learning under uncertainty. I study how learning dynamics evolve in structured, non-stationary, and strategic environments, and I design algorithms with provable guarantees on their long-term behavior. More recently, I have been exploring how these ideas can inform post-training and alignment in large language models. My work draws on tools from game theory, online learning, stochastic optimization, and dynamical systems.

Previously, I was a postdoctoral fellow at ETH Zurich, working with Niao He in the Department of Computer Science. I obtained my PhD in applied mathematics and computer science from Institut Polytechnique de Paris at Télécom Paris under the supervision of Pascal Bianchi and Walid Hachem. I received my engineering Master's degree in applied mathematics and computer science from Télécom Paris and a Master's degree in data science from Université Paris Saclay.

Here is my CV for more information.

News

  • 02/2026: Check out our new work: ‘Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training’, arXiv link.
  • 02/2026: ‘Optimistic Online Learning in Symmetric Cone Games’ accepted to Transactions on Machine Learning Research, link.
  • 01/2026: ‘Convex Markov Games and Beyond: New Proof of Existence, Characterization and Learning Algorithms for Nash Equilibria’ accepted to AISTATS 2026, arXiv link.
  • 10/2025: Designing and teaching a new course with John Lazarsfeld and Iosif Sakos. Check out our website: Online Learning and Learning in Games.
  • 09/2025: ‘On the Global Optimality of Policy Gradient Methods in General Utility Reinforcement Learning’ accepted to NeurIPS 2025.

Interests

  • Multi-Agent Learning
  • Reinforcement Learning
  • Stochastic Optimization

Education

  • PhD in Applied Mathematics and CS, 2021

    Institut Polytechnique de Paris (Télécom Paris)

  • MSc in Data Science, 2018

    Université Paris Saclay

  • MSc in Applied Mathematics and CS, 2018

    Télécom Paris

Talks

  • Invited talk - 5th Symposium on Machine Learning and Dynamical Systems
  • Invited talk - ICCOPT 2025
  • Invited talk - Learning Theory and Applications Workshop, NTU
  • Invited talk - Finance and RL Talks
  • Invited talk - 4th Symposium on Machine Learning and Dynamical Systems

Reviewing

  • Conferences: NeurIPS, ICML, ICLR, AISTATS, EC.

  • Journals: Journal of Machine Learning Research (JMLR), Transactions on Machine Learning Research (TMLR), Mathematical Programming, SIAM Journal on Optimization (SIOPT), Journal of Optimization Theory and Applications (JOTA), IEEE Transactions on Automatic Control.

Teaching

SUTD (2024-2025):

ETH Zurich (2022-2024):

Télécom Paris (2018-2021): Teaching Assistant for Optimization for Machine Learning (graduate level), Statistics (graduate level), and Probability (undergraduate level).