
Arash Ahmadian on Rethinking RLHF
TalkRL: The Reinforcement Learning Podcast
Introduction
Arash Ahmadian discusses reinforcement learning from human feedback (RLHF) and preference training in language models, highlighting his paper 'Back to Basics', which revisits REINFORCE-style optimization for learning from human feedback. The conversation also explores how deep RL differs from RLHF as applied to language models.