RLHF Roundup: Trying to get good at PPO, charting RLHF's impact, RewardBench retrospective, and a reward model competition

Jun 26, 2024
This episode covers the impact of RLHF on training language models, a retrospective on RewardBench's performance, and a reward modeling competition. It also discusses the challenges and progress of reinforcement learning from human feedback, comparisons between DPO- and PPO-trained models, and a competition on predicting user preferences among large language models.