AI Safety Fundamentals: Alignment

Is Power-Seeking AI an Existential Risk?

May 13, 2023
This episode examines the concern that misaligned AI systems pose an existential risk, covering the prospect of creating agents more intelligent than humans and a forecast of existential catastrophe by 2070. It covers the cognitive abilities of humans, the challenges of aligning AI systems with human values, and the concept of power-seeking AI. It also addresses the difficulty of ensuring good behavior in AI systems and the potential consequences of misalignment. The episode concludes with a discussion of the probabilities and uncertainties of existential catastrophe from power-seeking AI and the risk of the permanent disempowerment of humanity.
Chapters
1
Introduction
00:00 • 3min
2
Existential Risk from AI
02:35 • 9min
3
The Unique Cognitive Abilities of Humans and the Potential for Advanced AI Systems
11:14 • 18min
4
Agentic Planning, Strategic Awareness, and Incentives for APS Systems
28:46 • 20min
5
Exploring Power-Seeking AI and the Challenge of Alignment
49:08 • 2min
6
Unintended Behavior vs. Misaligned Behavior
50:54 • 2min
7
Misaligned Behavior and Power-Seeking in AI
53:08 • 12min
8
Low-Convergence in AI Systems
01:05:12 • 14min
9
Shaping AI Communication and Behavior
01:19:42 • 7min
10
Exploring Robust Forms of Practical PS-Alignment and Controlling Capabilities
01:26:38 • 5min
11
Specialized AI Systems vs General AI Systems
01:31:20 • 4min
12
Challenges of Power-Seeking Alignment in AI Systems
01:35:50 • 21min
13
Deploying AI Systems: Timing and Factors
01:57:09 • 19min
14
Challenges of Ensuring Good Behavior in AI Systems
02:16:08 • 17min
15
Power-seeking AI and the Challenges of Alignment
02:32:42 • 26min
16
The Moral Complexity of Power-Sharing with AI Agents
02:58:35 • 2min
17
Probabilities and Uncertainties of Existential Catastrophe from Power-Seeking AI
03:01:01 • 18min
18
Concerns Over the Risk of AI Systems Leading to Permanent Disempowerment of Humanity
03:19:08 • 2min