Practical Recommendations for AI Developers

The hosts recommend that developers study misaligned behaviors that boost evaluation awareness but avoid training on them unless the resulting alignment gains are clearly genuine.


