AI Safety Fundamentals: Alignment

Embedded Agents

May 13, 2023
Computer scientist Scott Garrabrant discusses the challenge of building learning agents for real-world goals. The podcast explores the concept of embedded agents, the four complications of embedded agency, and open problems in world models and subsystem alignment. It also delves into the conflicts that arise when spinning up sub-agents with different goals.
17:39

Podcast summary created with Snipd AI

Quick takeaways

  • Embedded agents face challenges in optimizing realistic goals in physical environments without clear input-output channels.
  • Decision theory, embedded world models, robust delegation, and subsystem alignment are active areas of research to address these challenges.

Deep dives

Decision Theory

Decision theory explores the challenges of optimization for embedded agents. Dualistic models like argmax, which find actions that maximize rewards, do not apply well to agents embedded in environments without clear input-output channels. Major open problems in decision theory include reasoning about counterfactuals, handling multiple agent copies within an environment, and addressing logical uncertainty.
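To make the dualistic picture concrete, here is a minimal Python sketch (the action set, reward values, and function names are illustrative, not from the episode). An argmax agent treats the environment as a separate oracle it can query for each candidate action, which is exactly the separation that embedded agency calls into question.

```python
# Minimal sketch of a "dualistic" argmax agent (illustrative names only).
# The agent and environment are separate objects connected by a clean
# input-output channel: the agent proposes an action, the environment
# reports an expected reward.

from typing import Callable, Iterable, TypeVar

Action = TypeVar("Action")

def argmax_agent(actions: Iterable[Action],
                 expected_reward: Callable[[Action], float]) -> Action:
    """Pick the action with the highest expected reward.

    This presumes the agent can evaluate each action "from outside"
    the environment, without the evaluation itself being an event
    inside the world the agent is part of.
    """
    return max(actions, key=expected_reward)

if __name__ == "__main__":
    # Toy example: three candidate actions with hand-coded rewards.
    rewards = {"left": 0.1, "right": 0.7, "wait": 0.3}
    print(argmax_agent(rewards.keys(), rewards.get))  # -> "right"
```

An embedded agent cannot step outside the loop like this: deliberating over an action is itself a physical process in the same environment being optimized, which is part of why counterfactual reasoning and logical uncertainty remain open problems.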
