Navigating Over-Optimization in RL

This chapter covers over-optimization in reinforcement learning across three settings: RL control, RLHF, and RLVR. It examines the difficulty of reward design and shows how poorly structured rewards can lead to unintended model behaviors, especially in mixed domains like coding and mathematics.

The RLVR Revolution — with Nathan Lambert (AI2, Interconnects.ai)

Latent Space: The AI Engineer Podcast
