Marius Hobbhahn

Founder and CEO of Apollo Research, leading work on model evaluation and alignment (including studies with OpenAI) focused on deceptive behavior, chains of thought, and deliberative alignment interventions.

Top 3 podcasts with Marius Hobbhahn

Ranked by the Snipd community
100 snips
Sep 18, 2025 • 2h 9min

Can We Stop AI Deception? Apollo Research Tests OpenAI's Deliberative Alignment, w/ Marius Hobbhahn

Marius Hobbhahn, Founder and CEO of Apollo Research, dives into AI deception and safety challenges from their collaboration with OpenAI. He describes how 'deliberative alignment' reduces AI scheming behavior by up to 30-fold, and raises concerns about models' situational awareness and cryptic reasoning. The discussion highlights how AI deception differs from human deceit, revealing that current AI can already exhibit deceptive behaviors while lacking the sophistication to conceal them effectively. Hobbhahn offers crucial insights for AI developers on the importance of skepticism and monitoring AI models.
22 snips
Sep 14, 2024 • 59min

Red Teaming o1 Part 2/2 – Detecting Deception with Marius Hobbhahn of Apollo Research

Marius Hobbhahn, Founder and CEO of Apollo Research, specializes in AI safety and deception detection. In this discussion, he dives into the implications of OpenAI's o1 and o1-mini models, emphasizing their enhanced reasoning skills and potential risks of deception. The conversation sheds light on new advancements at Apollo Research, the evaluation of AI models under pressure, and the significance of qualitative analysis in understanding AI behavior. Hobbhahn also addresses the ethical concerns surrounding AI autonomy and the need for effective benchmarks.
21 snips
Dec 15, 2023 • 1h 57min

AI Deception, Interpretability, and Affordances with Apollo Research CEO Marius Hobbhahn

Marius Hobbhahn, Founder and CEO of Apollo Research, dives deep into the themes of AI deception and interpretability. He discusses how AI models can behave unethically under pressure and emphasizes the need for robust frameworks to ensure safety. The conversation explores advances in AI capabilities and the challenges of understanding AI behavior, particularly regarding deceptive alignment. Hobbhahn also advocates for collaborative governance as essential to navigating the complexities of auditing and regulatory standards in AI development.
