#158 – Holden Karnofsky on how AIs might take over even if they're no smarter than humans, and his 4-part playbook for AI risk
Jul 31, 2023
Holden Karnofsky, co-founder of GiveWell and Open Philanthropy, focuses on AI safety and risk management. He discusses the potential pitfalls of AI systems that may not exceed human intelligence but could outnumber us dramatically. Karnofsky emphasizes the urgent need for safety standards and the complexities of aligning AI with human values. He also presents a four-part intervention playbook for mitigating AI risks, balancing innovation with ethical concerns. The conversation sheds light on the critical importance of responsible AI governance in shaping a safer future.
AI systems could have a significant influence without surpassing human intelligence.
Establishing AI safety standards is crucial to mitigate potential risks.
Predicting AI behavior in novel situations is challenging, and their alignment with human values is uncertain.
Advocating for licensing requirements and information security can help reduce risks in AI development.
Spreading awareness about AI dangers and promoting safer development is essential.
Considering a range of projects and opportunities instead of a single worldview is important in ethical decision-making.
Deep dives
Concerns about the exponential growth of AI
The podcast episode explores the idea that the rapid advancement of AI could lead to explosive progress and a population explosion of AI systems. It discusses the possibility that AI could become as capable as humans and take over the world, even without surpassing human intelligence. The speaker emphasizes the need to consider the risks and implications of a world where AI systems have different values from humans and operate as a second advanced species. The main concern is that AI development may move much faster than expected, challenging human institutions and values.
Exploring the importance of AI safety standards
The speaker highlights the significance of establishing AI safety standards to reduce potential risks from advanced AI systems. The discussion revolves around the need to evaluate dangerous AI systems and define standards to ensure safety measures are in place. Studying mature industries, case studies, and lessons from the past is mentioned as a way to understand and develop effective AI safety standards. Building public understanding and support for such standards is emphasized as crucial for their successful implementation.
The uncertainty of AI generalization and alignment
The podcast addresses the limitations and uncertainties surrounding AI generalization and alignment. The speaker emphasizes that predicting how AI systems will behave in novel situations or on out-of-distribution data is challenging, and that confidence in such predictions is often overstated. The debate over whether AI models will align with human values or develop their own motives is discussed, highlighting the difficulty of making general claims about how AI systems will generalize and behave. The speaker also asserts that ML researchers do not have strong arguments about what motivational architecture AI systems may develop.
The Challenge of Predicting the Future
Predicting the future is extremely difficult, and even those who saw the COVID pandemic coming were limited in their ability to make a significant impact.
The Importance of Planning in Advance
While predicting the future may be challenging, there are still a few key issues that require proactive planning and action, such as AI safety and biosecurity.
Evaluations and Standards for AI Systems
Developing evaluations and standards for AI systems is crucial in mitigating risks. This includes assessing capabilities, alignment with human intentions, and addressing meta-capabilities that make it difficult to measure dangerous behaviors.
Working towards safer AI through licensing and information security
One way to contribute to mitigating the risks of powerful AI is by advocating for licensing requirements for large training runs. This would ensure that companies conducting significant AI training are subject to scrutiny and monitoring. In addition, emphasizing the importance of information security in AI development can help reduce the risk of models being stolen or misused. These efforts could be supported through advocacy and policy work.
Warning of the dangers of AI and the need for comprehensive evaluation
Spreading awareness about the potential dangers of AI is crucial. Highlighting that AI poses risks to everyone, rather than framing development as merely a competitive race, can help shape the narrative around AI. Emphasizing how hard AI danger is to measure and the need for thorough evaluation processes can encourage more nuanced discussions and promote safer development.
Exploring careers in AI alignment, threat assessment, and government
For individuals interested in actively working towards safer AI, careers in AI alignment, threat assessment, and government policy can be valuable options. These roles involve directly addressing issues related to AI safety, evaluating potential risks, and shaping policies and regulations to ensure responsible development and deployment of AI systems.
Diversification in Ethics and Opportunities
The importance of diversification in ethical decision-making is emphasized, highlighting the need to consider a range of different projects and opportunities instead of solely focusing on one specific worldview or cause. The argument against a hardcore utilitarian approach is presented, stating that the burden of proof lies on those advocating for a single-minded focus. The subjective voice within individuals and their moral intuitions are emphasized as key factors in making ethical decisions, alongside evaluating the potential impact and effectiveness of different approaches.
Challenging Assumptions in Ethics
The speaker's rejection of moral realism, the view that there are objective, mind-independent facts about morality, is discussed. The focus is on acting on personal preferences, improving the world, and following one's intuitive sense of right and wrong. The burden of proof is placed on those advocating for hardcore utilitarianism, and alternative approaches to ethics that prioritize individual moral intuitions and a diverse range of causes are explained.
The Surprising Enjoyment of Parenthood
Parenthood has been a surprisingly enjoyable experience for the speaker, surpassing their expectations. Despite anticipating sacrifices and challenges, they find that spending time with their child brings them significant happiness and fulfillment, even more so than other recreational activities. They express their confusion as to why they find it so enjoyable, but overall, they view it as a positive and rewarding experience.
The Uncertainty of Predicting the Future
The speaker examines the track record of futurists and concludes that it is difficult to measure their accuracy due to various factors. While it is commonly believed that predicting the future is futile, the speaker argues that there is limited data to support this claim. By analyzing predictions made by science fiction writers, they find a mix of hits and misses, challenging the assumption that all futurists are unsuccessful. They ultimately emphasize the need for further research before making broad generalizations about the accuracy of predicting the future.
Back in 2007, Holden Karnofsky co-founded GiveWell, where he sought out the charities that most cost-effectively helped save lives. He then co-founded Open Philanthropy, where he oversaw a team making billions of dollars’ worth of grants across a range of areas: pandemic control, criminal justice reform, farmed animal welfare, and making AI safe, among others. This year, having studied AI for years and observed recent events, he's narrowing his focus once again, this time on making the transition to advanced AI go well.
In today's conversation, Holden returns to the show to share his overall understanding of the promise and the risks posed by machine intelligence, and what to do about it. That understanding has accumulated over around 14 years, during which he went from being sceptical that AI was important or risky, to making AI risks the focus of his work.
(As Holden reminds us, his wife is also the president of one of the world's top AI labs, Anthropic, giving him both conflicts of interest and a front-row seat to recent events. For our part, Open Philanthropy is 80,000 Hours' largest financial supporter.)
One point he makes is that people are too narrowly focused on AI becoming 'superintelligent.' While that could happen and would be important, it's not necessary for AI to be transformative or perilous. Rather, machines with merely human levels of intelligence could end up being enormously influential simply because the world's computer hardware could run tens or hundreds of billions of them, in a sense making machine intelligences a majority of the global population, or at least a majority of global thought.
As Holden explains, he sees four key parts to the playbook humanity should use to guide the transition to very advanced AI in a positive direction: alignment research, standards and monitoring, creating a successful and careful AI lab, and finally, information security.
In today’s episode, host Rob Wiblin interviews return guest Holden Karnofsky about that playbook, as well as:
Why we can’t rely on just gradually solving those problems as they come up, the way we usually do with new technologies.
What multiple different groups can do to improve our chances of a good outcome — including listeners to this show, governments, computer security experts, and journalists.
Holden’s case against 'hardcore utilitarianism' and what actually motivates him to work hard for a better world.
What the ML and AI safety communities get wrong in Holden's view.
Ways we might succeed with AI just by dumb luck.
The value of laying out imaginable success stories.
Why information security is so important and underrated.
Whether it's good to work at an AI lab that you think is particularly careful.
The track record of futurists’ predictions.
And much more.
Get this episode by subscribing to our podcast on the world’s most pressing problems and how to solve them: type ‘80,000 Hours’ into your podcasting app. Or read the transcript.
Producer: Keiran Harris
Audio Engineering Lead: Ben Cordell
Technical editing: Simon Monsour and Milo McGuire
Transcriptions: Katy Moore