Avoiding Extreme Global Vulnerability as a Core AI Governance Problem
May 13, 2023
The episode covers framings of the AI governance problem, the factors that incentivize harmful deployment of AI, the risks posed by delayed safety progress and the rapid diffusion of AI capabilities, ways of addressing the risks of widespread deployment of harmful AI, and approaches to avoiding extreme global vulnerability.
Podcast summary created with Snipd AI
Quick takeaways
Catastrophic harm from AI may require harmful decisions from only a minority of influential decision-makers, even ones acting with good intentions.
Factors such as misjudgment, winner-take-all competition, and a race to the bottom incentivize harmful AI deployment despite the risks.
Deep dives
The AI governance problem and its framing
The podcast discusses the framing and articulation of the AI governance problem through a catastrophic-risk lens. It aims to provide a synthesized introduction to prominent framings, including concerns over value erosion from competition, expectations of centralized or decentralized AI deployment, the vulnerable world hypothesis, AI governance as a coordination problem, trade-offs between AI system performance and safety, and the potential for bad value lock-in. The episode highlights that catastrophic harm may not require all relevant decision-makers to make harmful decisions: harmful decisions from just a minority of influential decision-makers, even ones with good intentions, could be enough to cause catastrophe.
Factors incentivizing harmful deployment actions
The podcast explains the factors that incentivize some actors to take harmful deployment actions despite the risks. These include misjudgment in assessing the consequences of AI deployment; winner-take-all competition, in which being the first to deploy advanced AI yields large gains; externalities that let actors capture the benefits of deployment while bearing only a fraction of the global risks; and a race to the bottom, in which actors cut corners to beat others to deployment, creating a dangerous feedback loop. Together, these factors could produce a significant period in which many actors have both the capacity and the incentives to unintentionally deploy catastrophically harmful AI.
Approaches to reduce AI deployment risks
The podcast explores several approaches to mitigating the risks of AI deployment: non-proliferation, in which actors coordinate to slow the diffusion of risky AI capabilities; deterrence, through creating disincentives for harmful deployment actions; assurance, through mechanisms by which actors can assure one another of their safe development intentions; awareness, by providing information about AI risks to potential developers; sharing of benefits and influence, to blunt winner-take-all incentives; and speeding up safety, by shortening or eliminating the period in which dangerous deployment decisions are possible before affordable predictive technologies are available. The episode concludes by noting the relevance of these approaches to concerns about catastrophic misuse of narrow AI, the importance of putting guardrails around competition on AI, and the risks of an overly centralized approach to AI development.
Episode notes
Much has been written framing and articulating the AI governance problem from a catastrophic risks lens, but these writings have been scattered. This page aims to provide a synthesized introduction to some of these already prominent framings. This is just one attempt at suggesting an overall frame for thinking about some AI governance problems; it may miss important things.
Some researchers think that unsafe development or misuse of AI could cause massive harms. A key contributor to some of these risks is that catastrophe may not require all or most relevant decision makers to make harmful decisions. Instead, harmful decisions from just a minority of influential decision makers (perhaps just a single actor with good intentions) may be enough to cause catastrophe. For example, some researchers argue that if just one organization deploys highly capable, goal-pursuing, misaligned AI, or if many businesses (but a small portion of all businesses) deploy somewhat capable, goal-pursuing, misaligned AI, humanity could be permanently disempowered.
The above would not be very worrying if we could rest assured that no actors capable of these harmful actions would take them. However, especially in the context of AI safety, several factors are arguably likely to incentivize some actors to take harmful deployment actions:
Misjudgment: Assessing the consequences of AI deployment may be difficult (as it is now, especially given the nature of AI risk arguments), so some organizations could easily get it wrong, concluding that an AI system is safe or beneficial when it is not.
"Winner-take-all" competition: If the first organization(s) to deploy advanced AI is expected to get large gains, while leaving competitors with nothing, competitors would be highly incentivized to cut corners in order to be first; they would have less to lose.