Anas Barakat
On the Sample Complexity of a Policy Gradient Algorithm with Occupancy Approximation for General Utility Reinforcement Learning
Anas Barakat, Souradip Chakraborty, Peihong Yu, Pratap Tokekar, Amrit Singh Bedi
October 2024
arXiv
Type: Conference paper
Publication: Under review
Related
Beyond Expected Returns: A Policy Gradient Algorithm for Cumulative Prospect Theoretic Reinforcement Learning
Policy Mirror Descent with Lookahead
Independent Learning in Constrained Markov Potential Games
Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space