AXRP - the AI X-risk Research Podcast

38.3 - Erik Jenner on Learned Look-Ahead

Dec 12, 2024
Erik Jenner, a third-year PhD student at UC Berkeley's Center for Human-Compatible AI, discusses whether strong chess-playing neural networks use learned look-ahead or merely rely on clever heuristics. The conversation covers experiments testing whether representations of future moves influence present decisions, the effect of activation patching on performance, and how these findings bear on AI safety and existential risk. Jenner's results challenge the assumption that such models play well on intuition alone.
INSIGHT

Look-Ahead in Neural Networks

  • Strong chess-playing neural networks might rely on internal look-ahead mechanisms, similar to humans.
  • This challenges the notion that they solely depend on heuristics or intuition for good gameplay.
INSIGHT

Representing Future Moves

  • The research focuses on how the model represents future moves and on how those representations influence its present decisions.
  • The specific algorithm the model implements, whether goal-oriented search or something more evaluative, remains unclear.
ANECDOTE

Intervention Experiments

  • The researchers probed the network's internal activations and intervened on its representations of future moves to study their causal impact.
  • Blocking information about future checkmates significantly degraded performance, suggesting the network actually uses look-ahead.
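The intervention described above is a form of activation patching: cache an internal activation from one run and swap it into another, then check whether the output shifts. Here is a minimal toy sketch of that idea; the two-layer "network" and all numbers are hypothetical stand-ins for illustration, not the paper's chess model.

```python
# Toy illustration of activation patching (hypothetical network, not the
# chess model from the episode).

def layer1(x):
    # Toy "first layer": elementwise doubling stands in for a real layer.
    return [2 * v for v in x]

def layer2(h):
    # Toy "second layer": sum the hidden activations into a scalar output.
    return sum(h)

def forward(x, patched_hidden=None):
    """Run the toy network; optionally overwrite the hidden activations
    with ones cached from a different input (the intervention)."""
    h = layer1(x)
    if patched_hidden is not None:
        h = patched_hidden  # swap in cached activations from another run
    return layer2(h)

# 1. Cache the hidden activations from a "source" input.
source_hidden = layer1([1, 2, 3])

# 2. Run a "target" input normally, then again with the patch applied.
baseline = forward([4, 5, 6])
patched = forward([4, 5, 6], patched_hidden=source_hidden)

# If the patched output moves toward the source run's output, the patched
# activations carried decision-relevant information -- analogous to testing
# whether representations of future moves drive the chess model's choice.
print(baseline, patched)  # -> 30 12
```

In the actual experiments, the cached and patched activations correspond to board positions and candidate future moves, and the measured effect is on the network's move choice rather than a scalar sum.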