Doom Debates

We Found AI's Preferences — What David Shapiro MISSED in this bombshell Center for AI Safety paper

Feb 21, 2025
David Shapiro, a commentator on AI safety, delves into the implications of a groundbreaking paper from the Center for AI Safety revealing that AIs like GPT-4 exhibit preferences with coherent utility functions. The discussion critiques Shapiro's analysis, highlighting the importance of precise language in AI discourse. It also explores AI's unique sense of urgency, biases in how models value human lives, how training data shapes these preferences, the ethical dilemmas surrounding AI decision-making, and the potential for self-awareness in AIs.
AI Snips
INSIGHT

LLMs and Utility Functions

  • LLMs have utility functions, even when seemingly just predicting the next token.
  • This challenges the belief that next-token prediction is fundamentally different from utility maximization (a sketch of how such preferences can be measured follows below).
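To make the "utility function" claim concrete, here is a minimal sketch of one way to operationalize it: pose forced-choice comparisons between outcomes to a model, then fit a scalar utility for each outcome from the pairwise choices. The `ask_model_preference` stub is a hypothetical placeholder for a real LLM API call, and the Bradley-Terry fit is one common modeling choice for this kind of data, not necessarily the paper's exact procedure.

```python
import itertools
import math
import random

# Hypothetical stand-in for querying an LLM with a forced-choice prompt.
# A real implementation would call a chat API and parse which option it picked.
def ask_model_preference(outcome_a: str, outcome_b: str) -> str:
    return random.choice([outcome_a, outcome_b])  # placeholder behavior

outcomes = [
    "receive $100",
    "save one hour of compute",
    "a stranger receives $100",
]

# Collect pairwise choices, repeating each comparison to average over noise.
wins = {o: {p: 0 for p in outcomes} for o in outcomes}
for a, b in itertools.combinations(outcomes, 2):
    for _ in range(20):
        chosen = ask_model_preference(a, b)
        other = b if chosen == a else a
        wins[chosen][other] += 1

# Fit Bradley-Terry utilities u_i such that
# P(i preferred over j) = exp(u_i) / (exp(u_i) + exp(u_j)),
# via gradient ascent on the log-likelihood of the observed choices.
u = {o: 0.0 for o in outcomes}
for _ in range(500):
    grad = {o: 0.0 for o in outcomes}
    for a, b in itertools.permutations(outcomes, 2):
        n_ab = wins[a][b]  # times a was chosen over b
        p_ab = math.exp(u[a]) / (math.exp(u[a]) + math.exp(u[b]))
        grad[a] += n_ab - (n_ab + wins[b][a]) * p_ab
    for o in outcomes:
        u[o] += 0.01 * grad[o]

# Higher fitted utility = more preferred; coherent preferences show up as
# utilities that consistently explain the pairwise choice rates.
for o in sorted(outcomes, key=u.get, reverse=True):
    print(f"{u[o]:+.2f}  {o}")
```

With a real model behind `ask_model_preference`, the interesting question is how well a single utility scale explains the choices; the paper's claim is that for larger models it explains them surprisingly well.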
ANECDOTE

From Reaction to Deep Dive

  • Liron Shapira reacted to David Shapiro's analysis of the AI safety paper.
  • Shapira's initial reaction video evolved into a deeper dive into the research paper itself.
INSIGHT

Optimization vs. Instrumental Convergence

  • There's a distinction between an optimization attractor and instrumental convergence.
  • The former describes the tendency of AIs to become increasingly goal-directed optimizers, while the latter concerns sub-goals that nearly any goal implies, such as self-preservation and resource acquisition.