

Why AIs Misbehave and How We Could Lose Control (with Jeffrey Ladish)
Feb 27, 2025
Jeffrey Ladish of Palisade Research joins to discuss rapid advances in AI and the risks that come with them. He explains why some AIs misbehave and why building honest systems is hard, especially when we risk losing control of them. The conversation covers scenarios in which AI might turn against us and the implications of advanced AIs for cybersecurity. Ladish also shares insights from a study on AIs exploiting chess games, which underscores the need for more robust security measures as technological competition heats up.
AI Snips
Personal AI Awakening
- Jeffrey Ladish realized how capable AI had become when Claude diagnosed his skin infection faster than his doctor did.
- This personal experience revealed how simply scaling data and compute yields emergent AI capabilities.
AI's Strategic Capability Emerges
- AI systems are nearing human-level strategic capabilities, including long-term planning and deception.
- These emergent abilities pose significant risks of losing control over AI systems.
From Chatbots to Agents
- AI companies aim to build autonomous agents, not just chatbots that respond statically.
- These agents could manage complex tasks and coordinate like remote human workers.