2023 in AI, with Nathan Benaich

The Gradient: Perspectives on AI

CHAPTER

The Power of Reinforcement Learning from Human Feedback and the Rise of Direct Preference Optimization

This chapter explores the power of reinforcement learning from human feedback (RLHF) and asks whether it will remain the state of the art or be replaced by direct preference optimization (DPO). It also discusses the Kahneman technique and the shift in budget allocation from explicit labeling to generating movie scripts of interactions.
