Buck Shlegeris

CEO of Redwood Research, focusing on AI control mechanisms to mitigate misalignment risks.

Top 5 podcasts with Buck Shlegeris

Ranked by the Snipd community

May 4, 2025 • 2h 26min

AI Control: Using Untrusted Systems Safely with Buck Shlegeris of Redwood Research, from the 80,000 Hours Podcast

In a compelling conversation, Buck Shlegeris, CEO of Redwood Research and a leader in AI safety, discusses innovative strategies for controlling untrusted AI systems. He delves into techniques like always-on auditing and honeypotting, shedding light on how to monitor AI behaviors effectively. Buck emphasizes the risks of AI misalignment and the importance of proactive measures to ensure safe outcomes. He also explores the ethical implications of AI evolution, calling for collaborative approaches to enhance safety in this rapidly developing field.

Apr 4, 2025 • 2h 16min

#214 – Buck Shlegeris on controlling AI that wants to take over – so we can use it anyway

Buck Shlegeris, CEO of Redwood Research, dives into the crucial topic of AI control mechanisms to mitigate risks of misalignment. He shares insights on developing safety protocols for advanced AIs that could potentially act against human interests. Shlegeris emphasizes actionable strategies that aren't as complex as they seem. The discussion highlights the urgent need for robust safeguards in AI deployment and the ethical implications of misaligned systems. He also explores the challenges of monitoring AI, underscoring a proactive approach to ensure safety and trust.

Apr 18, 2025 • 41min

Highlights: #214 – Buck Shlegeris on controlling AI that wants to take over – so we can use it anyway

In this enlightening discussion, Buck Shlegeris, CEO of Redwood Research and a pioneer in AI control, dives into the urgent need to manage misaligned AIs. He explains innovative techniques to detect and neutralize harmful behaviors, emphasizing the critical importance of proactive monitoring. The conversation also touches on the tension between corporate ambition and AI safety, exploring whether alignment strategies can truly keep us safe. Shlegeris advocates for small, focused teams to drive change from within the industry.

Jun 8, 2022 • 1h 16min

Taking pleasure in being wrong (with Buck Shlegeris)

Buck Shlegeris, CTO of Redwood Research, shares insights on the complexities of forming true beliefs and embracing the joy of being wrong. He discusses how we should approach certainty in our claims and the value of engaging in rigorous debates to refine our understanding. The conversation delves into effective learning strategies, particularly the benefits of small-scale practice in machine learning. Shlegeris also highlights the challenges in AI alignment, emphasizing the critical need to align AI systems with human values for future safety.

Dec 13, 2023 • 17min

AF - AI Control: Improving Safety Despite Intentional Subversion by Buck Shlegeris

Buck Shlegeris discusses AI control and safety methods for preventing catastrophic failures caused by colluding AI instances. The episode explores securing code submissions with advanced AI models, addresses high-stakes AI safety challenges, and emphasizes the importance of scalable oversight techniques. It also covers preventing intentional subversion in AI systems and outlines future directions for enhancing AI control and safety protocols.
