The podcast discusses the shortage of researchers working on AI alignment relative to the number working on machine learning capabilities. It highlights how little research the alignment field has produced and argues for a more rigorous, concerted effort. The episode explores approaches to aligning AGI, including the challenge of aligning superhuman systems with human values, and makes the case for drawing talented ML researchers into focused work on the core difficulties of the technical problem.
Podcast summary created with Snipd AI
Quick takeaways
The number of researchers working on AGI alignment is surprisingly low compared to those working on machine learning capabilities.
Alignment techniques relying on human supervision will not scale to superhuman AGI systems.
Deep dives
Lack of Focus on AGI Alignment
Despite concerns about AI risk and the perception of a well-funded effort, the number of researchers working on AGI alignment is surprisingly low compared to those working on machine learning capabilities.
Current Alignment Techniques are Inadequate
Alignment techniques relying on human supervision will not scale to superhuman AGI systems. Current research is either disconnected from machine learning models or focused on short-term alignment solutions that are unlikely to work for superhuman systems.
Scalable Alignment is Solvable
Scalable alignment, which addresses the challenge of aligning superhuman AGI systems, is a solvable problem. It requires a shift towards treating alignment as a machine learning problem and conducting empirical research with advanced models. However, a more concerted effort is needed to match the gravity of the challenge.
Observing from afar, it’s easy to think there’s an abundance of people working on AGI safety. Everyone on your timeline is fretting about AI risk, and it seems like there is a well-funded EA-industrial-complex that has elevated this to their main issue. Maybe you’ve even developed a slight distaste for it all—it reminds you a bit too much of the woke and FDA bureaucrats, and Eliezer seems pretty crazy to you.
That’s what I used to think too, a couple of years ago. Then I got to see things more up close. And here’s the thing: nobody’s actually on the friggin’ ball on this one!
There are far fewer people working on it than you might think. There are plausibly 100,000 ML capabilities researchers in the world (30,000 attended ICML alone) vs. 300 alignment researchers, a factor of ~300:1. The scalable alignment team at OpenAI has all of ~7 people.
Barely anyone is going for the throat of solving the core difficulties of scalable alignment. Many of the people who are working on alignment are doing blue-sky theory, pretty disconnected from actual ML models. Most of the rest are doing work that’s vaguely related, hoping it will somehow be useful, or working on techniques that might work now but predictably fail to work for superhuman systems.
There’s no secret elite SEAL team coming to save the day. This is it. We’re not on track.