Warm-Start Reinforcement Learning: From Function Approximation Error to Sub-Optimality Gap

Conventional reinforcement learning (RL) techniques face the formidable challenge of high sample complexity and heavy computational load, which hinders RL's applicability in real-world tasks. To tackle this challenge, Warm-Start RL is emerging as a promising new paradigm: the basic idea is to accelerate online learning by starting with an initial policy trained offline. Indeed, owing to the knowledge transferred from the initial policy, Warm-Start RL has been successfully applied in AlphaZero and ChatGPT, demonstrating its great potential to speed up online learning. Despite these remarkable successes, a fundamental understanding of Warm-Start RL is lacking. The primary objective of this study is to quantify the impact of function approximation errors on the sub-optimality gap of Warm-Start RL. We consider the widely used Actor-Critic method. Our findings reveal that a "good" warm-start policy (obtained by offline training) may be insufficient on its own, and that bias reduction during online learning also plays an essential role in lowering the sub-optimality gap.
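To make the setup concrete, below is a minimal Python sketch (not the speaker's method) of warm-starting an actor-critic loop: the actor parameters theta are initialized from an assumed offline-trained policy, and an online TD(0) critic supplies the advantage signal, whose approximation error is exactly the bias the abstract argues must be reduced online. The toy MDP, step sizes, and warm-start values are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy MDP (hypothetical, for illustration): 2 states, 2 actions.
    N_S, N_A, GAMMA = 2, 2, 0.9
    P = np.array([[[0.9, 0.1], [0.1, 0.9]],   # P[s, a, s']: transition probabilities
                  [[0.8, 0.2], [0.2, 0.8]]])
    R = np.array([[1.0, 0.0],                 # R[s, a]: rewards
                  [0.0, 1.0]])

    def softmax_policy(theta, s):
        z = np.exp(theta[s] - theta[s].max())
        return z / z.sum()

    # Warm start: the actor's parameters are assumed to come from offline
    # training (here a fixed array standing in for a pretrained policy).
    theta = np.array([[2.0, 0.0], [0.0, 2.0]])   # "good" warm-start policy
    w = np.zeros(N_S)                            # critic: tabular state values

    alpha_w, alpha_theta = 0.1, 0.05
    s = 0
    for t in range(5000):
        pi = softmax_policy(theta, s)
        a = rng.choice(N_A, p=pi)
        s_next = rng.choice(N_S, p=P[s, a])
        r = R[s, a]

        # Critic: TD(0) update. Any approximation error in w biases the
        # TD error that drives the actor; this is the online bias the
        # abstract says must be reduced, warm start or not.
        td_err = r + GAMMA * w[s_next] - w[s]
        w[s] += alpha_w * td_err

        # Actor: policy-gradient step using the (possibly biased) TD error.
        grad_log = -pi
        grad_log[a] += 1.0
        theta[s] += alpha_theta * td_err * grad_log
        s = s_next

    print("learned values:", w)
    print("policy at s=0:", softmax_policy(theta, 0))

Even with a strong warm-start theta, a poorly estimated critic w can push the actor in the wrong direction early on, which is one way to read the abstract's claim that a good initial policy alone is not enough.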
Speaker: Junshan Zhang, UC Davis
Attend in person or online. Passcode: 2009A
Thursday, 10/19/23
Cost: Free
Sonoma State Dept. of Engineering Science
Cerent Engineering Science Complex, Salazar Hall Room #2009A
Rohnert Park, CA 94928
Phone: (707) 664-2030
