AI Safety Fundamentals: Alignment

Is Power-Seeking AI an Existential Risk?

May 13, 2023

The podcast explores the concern of existential risk from misaligned AI systems, discussing the potential for creating more intelligent agents than humans and the prediction of an existential catastrophe by 2070. It delves into the cognitive abilities of humans, the challenges of aligning AI systems with human values, and the concept of power-seeking AI. The chapter also explores the difficulties of ensuring good behavior in AI systems and the potential risks and consequences of misalignment. The podcast concludes with a discussion on the probabilities and uncertainties of existential catastrophe from power-seeking AI and the risk of permanent disempowerment of humanity.

Ask episode

Chapters

Transcript

Episode notes

Existential Risk from AI

The Unique Cognitive Abilities of Humans and the Potential for Advanced AI Systems

11:14 • 18min

Agentic Planning, Strategic Awareness, and Incentives for APS Systems

28:46 • 20min

Exploring Power-Seeking AI and the Challenge of Alignment

Unintended Behavior vs. Misaligned Behavior

Misaligned Behavior and Power-Seeking in AI

53:08 • 12min

Low-Convergence in AI Systems

01:05:12 • 14min

Shaping AI Communication and Behavior

01:19:42 • 7min

Exploring Robust Forms of Practical P.S. Alignment and Controlling Capabilities

01:26:38 • 5min

Specialized AI Systems vs General AI Systems

01:31:20 • 4min

Challenges of Power-Seeking Alignment in AI Systems

01:35:50 • 21min

Deploying AI Systems: Timing and Factors

01:57:09 • 19min

Challenges of Ensuring Good Behavior in AI Systems

02:16:08 • 17min

Power-seeking AI and the Challenges of Alignment

02:32:42 • 26min

The Moral Complexity of Power-Sharing with AI Agents

02:58:35 • 2min

Probabilities and Uncertainties of Existential Catastrophe from Power-Seeking AI

03:01:01 • 18min

Concerns over the risk of AI systems leading to permanent disempowerment of humanity

03:19:08 • 2min