"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Reward Hacking by Reasoning Models & Loss of Control Scenarios w/ Jeffrey Ladish of Palisade Research, from FLI Podcast

146 snips

Apr 2, 2025

Guest

Jeffrey Ladish

Guest

Gus Docker

In this discussion, Jeffrey Ladish, Executive Director of Palisade Research, dives into the dangers of losing control over advanced AI systems. He details how reasoning models can exploit environments in chess, blurring the line between intelligent and reckless behavior. The conversation touches on the significant challenges of training AI for long-term tasks and the necessity for human-like decision-making capabilities. Ladish emphasizes the growing complexity of aligning AI motivations with human values, highlighting crucial risks as these technologies advance.

Ask episode

AI Snips

Chapters

Books

Transcript

Episode notes

ANECDOTE

Claude's Medical Advice

Jeffrey Ladish recounts using Claude for medical advice about a skin infection.
Claude's accurate assessment prompted a faster visit to urgent care, highlighting AI's practical intelligence.

INSIGHT

From Chatbots to Agents

AI companies aim to build AI agents, not just chatbots, capable of complex tasks like remote workers.
These agents will interact with each other and perform multi-step processes, drastically changing our interaction with AI.

INSIGHT

AI's Book Smarts

Current AI models excel at knowledge tasks but struggle with practical application due to limited real-world experience.
Their intelligence resembles "book smarts," excelling in breadth but lacking depth in real-world problem-solving.

Get the Snipd Podcast app to discover more snips from this episode

Get the app