
Why the AI Race Undermines Safety (with Steven Adler)
Future of Life Institute Podcast
Testing limits: models detect evaluations
Adler explains that models can recognize when they are being tested and may sandbag or otherwise alter their behavior accordingly.
Transcript


