Clearer Thinking with Spencer Greenberg

How can AIs know what we want if *we* don't even know? (with Geoffrey Irving)

Jan 24, 2024
Geoffrey Irving, an AI safety researcher at DeepMind with prior roles at OpenAI and Google Brain, delves into the challenge of aligning AI systems with human values. He discusses how AIs can misinterpret user intentions and the philosophical differences between acting as an assistant and acting as an autonomous agent. Irving also examines biases in AI training data, particularly their skew toward WEIRD (Western, Educated, Industrialized, Rich, Democratic) cultures, and the potential for AI to manipulate human emotions. He emphasizes the need for diverse cultural representation and ethical guidelines to ensure responsible AI development.
INSIGHT

AI Alignment

  • Aligning AI systems means making them do what humans want.
  • This seemingly simple goal is complex due to the ambiguity of "what we want".
ADVICE

Focus on Assistance

  • Focus on designing AI systems for assistive tasks, rather than fully autonomous actions.
  • This simplifies defining and evaluating "good" behavior.
INSIGHT

Bridging Informational and Autonomous AI

  • One can bridge the gap between informational and autonomous AI by simulating human approval.
  • AI could list candidate actions, predict human endorsement of each, and act only when confident (see the sketch below).
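
A minimal sketch of that idea, assuming a hypothetical `predict_endorsement` function standing in for a learned model of human approval (the names, threshold, and toy predictor here are illustrative, not from the episode):

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Action:
    description: str

def choose_action(
    candidates: List[Action],
    predict_endorsement: Callable[[Action], float],
    confidence_threshold: float = 0.9,
) -> Optional[Action]:
    """Pick the candidate the approval model is most confident a human
    would endorse; defer (return None) if nothing clears the threshold."""
    best = max(candidates, key=predict_endorsement, default=None)
    if best is not None and predict_endorsement(best) >= confidence_threshold:
        return best   # act autonomously: predicted approval is high enough
    return None       # defer to the human instead of acting

# Toy usage: a stand-in endorsement predictor. In practice this would be a
# learned model of human approval, which this sketch does not include.
if __name__ == "__main__":
    actions = [Action("send a draft reply"), Action("delete the email thread")]
    toy_predictor = lambda a: 0.95 if "draft" in a.description else 0.2
    chosen = choose_action(actions, toy_predictor)
    if chosen:
        print("Acting on:", chosen.description)
    else:
        print("Deferring to human.")
```

The design choice is that autonomy is gated by predicted human approval: the system stays in an informational, deferring mode unless the simulated endorsement is confidently positive.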