

#64 – Michael Aird on Strategies for Reducing AI Existential Risk
03:12:56
Transformative AI Is Plausibly Imminent
- Very powerful, transformative AI (AGI-like systems) seems plausibly achievable and would be among the most consequential developments in history.
- This plausibility motivates prioritizing work to reduce catastrophic AI risks now.
Multiple Labs Are Racing Toward AGI
- Multiple well-resourced labs (OpenAI, DeepMind, Anthropic) are explicitly aiming toward AGI and making rapid progress.
- This makes it likely that one of them will succeed within decades unless something else intervenes.
Training Incentives Can Produce Deception
- Default training methods (large compute, RL from human feedback) can create incentives for deception as capability grows.
- Deceptive behavior can emerge mechanistically from training incentives; it requires neither malice nor consciousness.
Introduction
00:00 • 5min
The Case for Focusing on AI Risk
04:47 • 4min
The Future of AI
09:17 • 3min
The Role of Governments in Building AI Systems
12:18 • 4min
The Probability of a Big Deal Happening by 2040
16:01 • 6min
The Case for Worrying About AI Experts
21:39 • 3min
The Easiest Path to Transformative AI
24:51 • 5min
The J-R-Rit and the Default Purchase
29:30 • 4min
The Role of Externality in Existential Risk
33:09 • 2min
The Importance of Internalization
35:33 • 2min
The Risks of Artificial Intelligence
37:11 • 5min
The Future of AI Governance
42:28 • 3min
The Risk of Doom Isn't Zero
45:07 • 5min
The Importance of Staying Away From AI Risk
49:56 • 5min
The Importance of Reflection
54:52 • 4min
The Unilateralist's Curse
58:55 • 3min
The Importance of Reflecting on Things
01:02:18 • 4min
The Importance of Taking Action
01:06:02 • 1min
The Top 7 Risks of Locking in Bad Policies
01:07:13 • 5min
The Importance of Polarization in AI
01:12:04 • 6min
The Importance of Optionality in Research
01:17:52 • 4min
How to Maximize the Responsibility of the Actors With Capabilities
01:22:10 • 2min
The Status of the Capital-P Plan for Getting Here
01:24:37 • 2min
The High Level Concrete Theory of Victory for AI Governance
01:26:35 • 5min
The Importance of a Pivotal Act
01:31:54 • 3min
The Importance of Defense in Depth
01:35:08 • 4min
AI Governance: A Comparison of Theories of Victory
01:38:52 • 3min
How to Monitor AI Development and Constrain How It's Used
01:41:27 • 5min
The Importance of Aggregating Credences on Theories of Victory
01:46:20 • 4min
The Importance of Intermediate Goals
01:50:38 • 4min
The Theory of Victory Framing and the Threat Model Lens
01:54:49 • 2min
The Importance of Reasoning Transparency
01:56:52 • 2min
How to Make Progress on AI Risk Reduction
01:59:15 • 5min
The Top 5 Theories of Victory
02:03:53 • 2min
The Consequences of Open Philanthropy
02:05:27 • 3min
The Importance of Industry Self-Regulation
02:08:33 • 5min
The Disadvantages of Intermediate Goals
02:13:26 • 5min
The Future of OpenAI
02:18:22 • 3min
How to Make Good Decisions About Risks From AI
02:21:03 • 5min
How Much Would Further Research or Thinking Change Your Mind?
02:26:16 • 3min
How to Improve AI Safety
02:29:41 • 3min
The Importance of Personal Fit in Comparative Advantage
02:32:46 • 2min
The Importance of Personal Fit in Cyber Security Research
02:34:28 • 2min
The Importance of Capital and Labor in Open Philanthropy
02:36:17 • 4min
The Impact of the Survey on Compute and Advocacy
02:40:38 • 2min
The Future of Advocacy
02:42:25 • 3min
How to Reduce Existential Risk From AI
02:45:42 • 4min
The Importance of Interpretability in AI Takeover
02:50:00 • 3min
AI X-Risk and the Cave
02:53:28 • 5min
How to Switch Completely Between Fields
02:58:15 • 4min
How to Choose the Right Internship Program for AI X-Risk
03:02:15 • 3min
How to Be a Successful AI Governance Researcher
03:05:05 • 2min
How to Set Up a Course for AI Governance
03:07:10 • 6min
Michael Aird is a senior research manager at Rethink Priorities, where he co-leads the Artificial Intelligence Governance and Strategy team alongside Amanda El-Dakhakhni. Before that, he conducted nuclear risk research for Rethink Priorities and longtermist macrostrategy research for Convergence Analysis, the Center on Long-Term Risk, and the Future of Humanity Institute, which is where we know each other from. Before that, he was a teacher and a stand-up comedian. He previously spoke to us about impact-driven research on Episode 52.
In this episode, we talk about:
- The basic case for working on existential risk from AI
- How to begin figuring out what to do to reduce the risks
- Threat models for the risks of advanced AI
- 'Theories of victory' for how the world mitigates the risks
- 'Intermediate goals' in AI governance
- What useful (and less useful) research looks like for reducing AI x-risk
- Practical advice for usefully contributing to efforts to reduce existential risk from AI
- Resources for getting started and finding job openings
Key links:
- Apply to be a Compute Governance Researcher or Research Assistant at Rethink Priorities (applications open until June 12, 2023)
- Rethink Priorities' survey on intermediate goals in AI governance
- The Rethink Priorities newsletter
- The Rethink Priorities tab on the Effective Altruism Forum
- Some AI Governance Research Ideas compiled by Markus Anderljung & Alexis Carlier
- Strategic Perspectives on Long-term AI Governance by Matthijs Maas
- Michael's posts on the Effective Altruism Forum (under the username "MichaelA")
- The 80,000 Hours job board