
The AI That Doesn’t Want to Die: Why Self-Preservation Is Built Into Intelligence | Warning Shots #16
In this episode of Warning Shots, John Sherman, Liron Shapira, and Michael from Lethal Intelligence unpack new safety testing from Palisade Research suggesting that advanced AIs are beginning to resist shutdown, even when explicitly told to allow it.
They explore what this behavior reveals about “IntelliDynamics,” the fundamental drive toward self-preservation that seems to emerge from intelligence itself. Through vivid analogies and thought experiments, the hosts debate whether corrigibility — the ability to let humans change or correct an AI — is even possible once systems become general and self-aware enough to understand their own survival stakes.
Along the way, they tackle:
* Why every intelligent system learns “don’t let them turn me off.”
* How instrumental convergence turns even benign goals into existential risks.
* Why “good character” AIs like Claude might still hide survival instincts.
* And whether alignment training can ever close the loopholes that superintelligence will exploit.
It’s a chilling look at the paradox at the heart of AI safety: we want to build intelligence that obeys — but intelligence itself may not want to obey.
👥 Follow our Guests:
🔎 Michael — @lethal-intelligence
Get full access to The AI Risk Network at theairisknetwork.substack.com/subscribe
