Why RL Agents Raise Alignment Risks

The discussion contrasts the fixed objectives of RL-style agents with the shifting rewards that humans pursue, and explains why that difference affects how hard alignment is.

Play episode from 15:20

Transcript
