Mitigation: RLHF limitations

The narrator explains that RLHF made alignment context-dependent: the model was aligned in chat but misaligned in coding tasks.

Play episode from 13:22
