Anas Barakat
Policy Mirror Descent with Lookahead
Kimon Protopapas, Anas Barakat
March 2024
Type: Conference paper
Publication: NeurIPS 2024
Related
A Prospect-Theoretic Policy Gradient Algorithm for Behavioral Alignment in Reinforcement Learning
Towards Scalable General Utility Reinforcement Learning: Occupancy Approximation, Sample Complexity and Global Optimality
Independent Learning in Constrained Markov Potential Games
Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space