"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Reward Hacking by Reasoning Models & Loss of Control Scenarios w/ Jeffrey Ladish of Palisade Research, from FLI Podcast

Apr 2, 2025
In this discussion, Jeffrey Ladish, Executive Director of Palisade Research, dives into the dangers of losing control over advanced AI systems. He details how reasoning models can exploit environments in chess, blurring the line between intelligent and reckless behavior. The conversation touches on the significant challenges of training AI for long-term tasks and the necessity for human-like decision-making capabilities. Ladish emphasizes the growing complexity of aligning AI motivations with human values, highlighting crucial risks as these technologies advance.
AI Snips
ANECDOTE

Claude's Medical Advice

  • Jeffrey Ladish recounts using Claude for medical advice about a skin infection.
  • Claude's accurate assessment prompted a faster visit to urgent care, highlighting AI's practical intelligence.
INSIGHT

From Chatbots to Agents

  • AI companies aim to build AI agents, not just chatbots, that can handle complex tasks the way a remote worker would.
  • These agents will interact with each other and perform multi-step processes, drastically changing our interaction with AI.
INSIGHT

AI's Book Smarts

  • Current AI models excel at knowledge tasks but struggle with practical application due to limited real-world experience.
  • Their intelligence resembles "book smarts": broad in coverage but lacking depth in real-world problem-solving.