
Buck Shlegeris
CEO of Redwood Research, focusing on AI control mechanisms to mitigate misalignment risks.
Top 5 podcasts with Buck Shlegeris
Ranked by the Snipd community

69 snips
Apr 4, 2025 • 2h 16min
#214 – Buck Shlegeris on controlling AI that wants to take over – so we can use it anyway
Buck Shlegeris, CEO of Redwood Research, dives into the crucial topic of AI control mechanisms to mitigate risks of misalignment. He shares insights on developing safety protocols for advanced AIs that could potentially act against human interests. Shlegeris emphasizes actionable strategies that aren't as complex as they seem. The discussion highlights the urgent need for robust safeguards in AI deployment and the ethical implications of misaligned systems. He also explores the challenges of monitoring AI, underscoring a proactive approach to ensure safety and trust.

9 snips
Jun 8, 2022 • 1h 16min
Taking pleasure in being wrong (with Buck Shlegeris)
Buck Shlegeris, CTO of Redwood Research, shares insights on the complexities of forming true beliefs and embracing the joy of being wrong. He discusses how we should approach certainty in our claims and the value of engaging in rigorous debates to refine our understanding. The conversation delves into effective learning strategies, particularly the benefits of small-scale practice in machine learning. Shlegeris also highlights the challenges in AI alignment, emphasizing the critical need to align AI systems with human values for future safety.

Dec 13, 2023 • 17min
AF - AI Control: Improving Safety Despite Intentional Subversion by Buck Shlegeris
Buck Shlegeris discusses AI control and safety methods for preventing catastrophic failures caused by colluding AI instances. The episode explores securing code submissions with advanced AI models, addresses high-stakes AI safety challenges, and emphasizes the importance of scalable oversight techniques. It also covers preventing intentional subversion in AI systems and outlines future directions for enhancing AI control and safety protocols.

Apr 18, 2025 • 41min
Highlights: #214 – Buck Shlegeris on controlling AI that wants to take over – so we can use it anyway
In this enlightening discussion, Buck Shlegeris, CEO of Redwood Research and a pioneer in AI control, dives into the urgent need to manage misaligned AIs. He explains innovative techniques to detect and neutralize harmful behaviors, emphasizing the critical importance of proactive monitoring. The conversation also touches on the tension between corporate ambition and AI safety, exploring whether alignment strategies can truly keep us safe. Shlegeris advocates for small, focused teams to drive change from within the industry.

Nov 4, 2023 • 9min
AF - Untrusted smart models and trusted dumb models by Buck Shlegeris
Buck Shlegeris, a writer and contributor to the AI Alignment Forum, discusses the importance of capability evaluations in determining which AI models can be trusted. He suggests separating models into two categories for safety: smart but untrusted, and dumb but trusted. The episode also covers the challenges of using trusted models for safety research and the need for monitoring and oversight in AI integration.