2023 in AI, with Nathan Benaich

The Gradient: Perspectives on AI

The Power of Reinforcement Learning from Human Feedback and the Rise of Direct Preference Optimization

This chapter explores the power of reinforcement learning from human feedback (RLHF) and asks whether it will remain the state of the art or be replaced by direct preference optimization (DPO). It also discusses the Kahneman technique and the shift in budget allocation from explicit labeling to generating movie scripts of interactions.
