Arash Ahmadian on Rethinking RLHF

TalkRL: The Reinforcement Learning Podcast

Introduction

Arash Ahmadian discusses reinforcement learning from human feedback (RLHF) and preference training in language models, highlighting his paper 'Back to Basics', which revisits REINFORCE-style optimization for learning from human feedback. The conversation also explores how deep RL differs from RLHF as applied to language models.
