
AI Safety Fundamentals: Alignment
Embedded Agents
May 13, 2023
Computer scientist Scott Garrabrant discusses the challenge of building learning agents for real-world goals. The podcast explores the concept of embedded agents, the four complications of embedded agency, and open problems in world models and subsystem alignment. It also delves into the conflicts that arise when spinning up sub-agents with different goals.
Duration: 17:39
Episode notes
Podcast summary created with Snipd AI
Quick takeaways
- Embedded agents face challenges in optimizing realistic goals in physical environments without clear input-output channels.
- Decision theory, embedded world models, robust delegation, and subsystem alignment are active areas of research to address these challenges.
Deep dives
Decision Theory
Decision theory examines what optimization even means for embedded agents. Dualistic models such as argmax, which select the action that maximizes expected reward, assume the agent sits outside its environment with well-defined input-output channels; that assumption breaks down when the agent is itself part of the environment it is optimizing. Major open problems include reasoning about counterfactuals (what would happen if a deterministic agent took an action it will not in fact take), handling multiple copies of the agent within the same environment, and dealing with logical uncertainty.
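As a point of reference, here is a minimal sketch (not from the episode) of the dualistic argmax rule it contrasts with embedded agency. The names `world_model`, `utility`, and the toy payoff are hypothetical illustrations; the point is only that the rule presupposes a fixed menu of actions and a model of the world evaluated from outside it.

```python
# Sketch of the "dualistic" argmax decision rule, assuming a toy environment.
from typing import Callable, Iterable, TypeVar

Action = TypeVar("Action")
Outcome = TypeVar("Outcome")

def argmax_policy(
    actions: Iterable[Action],
    world_model: Callable[[Action], Outcome],
    utility: Callable[[Outcome], float],
) -> Action:
    """Pick the action whose predicted outcome has the highest utility.

    This presupposes the dualistic picture: a fixed set of actions and a
    world model that maps each action to an outcome "from outside" the
    world. An embedded agent has no such clean boundary - it is part of
    the environment it is modelling, so evaluating the counterfactual
    "what if I did a instead?" is itself one of the open problems.
    """
    return max(actions, key=lambda a: utility(world_model(a)))

if __name__ == "__main__":
    # Toy usage with a made-up payoff model: spend 0-4 units of a resource.
    actions = range(5)
    world_model = lambda a: a * (4 - a)       # predicted outcome of each action
    utility = lambda outcome: float(outcome)  # utility of that outcome
    print(argmax_policy(actions, world_model, utility))  # -> 2
```

The episode's argument is that each input to this function (the action set, the world model, the boundary between agent and environment) stops being well-defined once the agent is embedded in the world it acts on.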