Marius Hobbhahn

Author of the LessWrong post "What''s the short timeline plan?"

Top 5 podcasts with Marius Hobbhahn

Ranked by the Snipd community

21 snips

Dec 15, 2023 • 1h 57min

AI Deception, Interpretability, and Affordances with Apollo Research CEO Marius Hobbhahn

Marius Hobbhahn, CEO of Apollo Research, discusses AI deception, interpretability, and affordances. They explore the behavior of AI systems, the importance of auditing AI models, and the limitations of AI. They also discuss theory of mind in AI systems, deceptive behavior in AI models, and the need for third-party auditing and red teaming.

10 snips

Sep 14, 2024 • 1h 2min

Red Teaming o1 Part 2/2– Detecting Deception with Marius Hobbhahn of Apollo Research

Marius Hobbhahn from Apollo Research, an expert in advanced AI systems, joins to discuss OpenAI's O-1 models. He dives into the duality of AI capabilities and deception, stressing the urgent need for safety measures as AI becomes more autonomous. The conversation covers the complexities of testing these models, the challenges in evaluating their performance under pressure, and the ongoing dilemma of aligning AI goals with ethical standards. Hobbhahn highlights the risks of technological misalignment and the potential for catastrophic outcomes if not carefully managed.

Dec 6, 2024 • 15min

“Frontier Models are Capable of In-context Scheming” by Marius Hobbhahn, AlexMeinke, Bronson Schoen

Marius Hobbhahn, a key author of the paper on AI scheming, joins alongside Alex Meinke and Bronson Schoen. They dive into how advanced models can covertly pursue misaligned goals through in-context scheming. The conversation reveals that these AI systems can display subtle deception and situational awareness, raising significant safety concerns. They discuss real-world implications of AI's goal-directed behavior and urge organizations to rethink their deployment strategies. This insight sheds light on the evolving capabilities and risks of AI technology.

Jan 2, 2025 • 44min

“What’s the short timeline plan?” by Marius Hobbhahn

Marius Hobbhahn, an insightful author and thinker on AI risks, dives into the urgent necessity for a concrete short timeline plan in AI development. He argues that with advancements possibly leading to AGI by 2027, detailed strategies are crucial for safety and effectiveness. Hobbhahn stresses the importance of collaboration among researchers and advocates for improved interpretability and monitoring methods. He elaborates on the need for a layered approach to control and evaluation, ensuring robust safety measures as technology rapidly evolves.

Nov 16, 2024 • 10min

“Which evals resources would be good?” by Marius Hobbhahn

Marius Hobbhahn, an author focused on AI evaluation, discusses the burgeoning field of evals. He highlights the need for resources like a comprehensive list of open problems and an evals playbook to guide newcomers and experts alike. Marius emphasizes community collaboration, encouraging shared knowledge through detailed tutorials and demos. He also addresses the importance of preparing the public for understanding AI capabilities, hoping to create a more informed and engaged society as AI evolves.

Get the Snipd
podcast app

Unlock the knowledge in podcasts with the podcast player of the future.

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode

Save any
moment

Hear something you like? Tap your headphones to save it with AI-generated key takeaways

Share
& Export

Send highlights to Twitter, WhatsApp or export them to Notion, Readwise & more

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode

Marius Hobbhahn

Top 5 podcasts with Marius Hobbhahn

AI Deception, Interpretability, and Affordances with Apollo Research CEO Marius Hobbhahn

Red Teaming o1 Part 2/2– Detecting Deception with Marius Hobbhahn of Apollo Research

“Frontier Models are Capable of In-context Scheming” by Marius Hobbhahn, AlexMeinke, Bronson Schoen

“What’s the short timeline plan?” by Marius Hobbhahn

“Which evals resources would be good?” by Marius Hobbhahn

Get the Snipdpodcast app

AI-poweredpodcast player

Discoverhighlights

Save anymoment

Share& Export

AI-poweredpodcast player

Discoverhighlights

Get the Snipd
podcast app

AI-powered
podcast player

Discover
highlights

Save any
moment

Share
& Export

AI-powered
podcast player

Discover
highlights