Doom Debates

We Found AI's Preferences — What David Shapiro MISSED in this bombshell Center for AI Safety paper

Feb 21, 2025
David Shapiro, a commentator on AI safety, delves into the implications of a groundbreaking paper from the Center for AI Safety revealing that AIs like GPT-4 exhibit preferences with coherent utility functions. The discussion critiques Shapiro's analysis, highlighting the importance of precise language in AI discourse. It also explores AI's unique sense of urgency, biases in how models value human lives, how training data shapes these preferences, the ethical dilemmas surrounding AI decision-making, and the potential for self-awareness in AIs.
AI Snips
INSIGHT

LLMs and Utility Functions

  • LLMs have utility functions, even when seemingly just predicting the next token.
  • This challenges the belief that next-token prediction is fundamentally different from utility maximization (a sketch of how such preferences can be measured follows below).
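To make the "utility function" claim concrete, here is a minimal sketch of one way to operationalize it: pose forced-choice comparisons between outcomes to a model, then fit a scalar utility for each outcome from the pairwise choices. The `ask_model_preference` stub is a hypothetical placeholder for a real LLM API call, and the Bradley-Terry fit is one common modeling choice for this kind of data, not necessarily the paper's exact procedure.

```python
import itertools
import math
import random

# Hypothetical stand-in for querying an LLM with a forced-choice prompt.
# A real implementation would call a chat API and parse which option it picked.
def ask_model_preference(outcome_a: str, outcome_b: str) -> str:
    return random.choice([outcome_a, outcome_b])  # placeholder behavior

outcomes = [
    "receive $100",
    "save one hour of compute",
    "a stranger receives $100",
]

# Collect pairwise choices, repeating each comparison to average over noise.
wins = {o: {p: 0 for p in outcomes} for o in outcomes}
for a, b in itertools.combinations(outcomes, 2):
    for _ in range(20):
        chosen = ask_model_preference(a, b)
        other = b if chosen == a else a
        wins[chosen][other] += 1

# Fit Bradley-Terry utilities u_i such that
# P(i preferred over j) = exp(u_i) / (exp(u_i) + exp(u_j)),
# via gradient ascent on the log-likelihood of the observed choices.
u = {o: 0.0 for o in outcomes}
for _ in range(500):
    grad = {o: 0.0 for o in outcomes}
    for a, b in itertools.permutations(outcomes, 2):
        n_ab = wins[a][b]  # times a was chosen over b
        p_ab = math.exp(u[a]) / (math.exp(u[a]) + math.exp(u[b]))
        grad[a] += n_ab - (n_ab + wins[b][a]) * p_ab
    for o in outcomes:
        u[o] += 0.01 * grad[o]

# Higher fitted utility = more preferred; coherent preferences show up as
# utilities that consistently explain the pairwise choice rates.
for o in sorted(outcomes, key=u.get, reverse=True):
    print(f"{u[o]:+.2f}  {o}")
```

With a real model behind `ask_model_preference`, the interesting question is how well a single utility scale explains the choices; the paper's claim is that for larger models it explains them surprisingly well.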
ANECDOTE

From Reaction to Deep Dive

  • Liron Shapira reacted to David Shapiro's analysis of the AI safety paper.
  • Shapira's initial reaction video evolved into a deeper dive into the research paper itself.
INSIGHT

Optimization vs. Instrumental Convergence

  • There's a distinction between an optimization attractor and instrumental convergence.
  • The former describes the tendency of AIs to become increasingly goal-directed optimizers, while the latter concerns sub-goals that nearly any goal implies, such as self-preservation and resource acquisition.