
The Inside View
The goal of this podcast is to create a place where people discuss their inside views about existential risk from AI.
Latest episodes

Aug 23, 2024 • 2h 16min
Owain Evans - AI Situational Awareness, Out-of-Context Reasoning
Owain Evans, an AI Alignment researcher at UC Berkeley's Center for Human-Compatible AI, dives deep into AI situational awareness. He discusses his recent papers on building a situational-awareness benchmark for large language models and on their surprising out-of-context reasoning capabilities. The conversation covers safety implications, deceptive alignment, and how the benchmark evaluates LLM performance. Evans emphasizes the need for vigilant monitoring during AI training and touches on the challenges and future of model evaluations.

May 17, 2024 • 2h 16min
[Crosspost] Adam Gleave on Vulnerabilities in GPT-4 APIs (+ extra Nathan Labenz interview)
Adam Gleave from FAR AI and Nathan Labenz discuss vulnerabilities in GPT-4's APIs, accidental jailbreaking during fine-tuning, malicious code generation, the risk of private email discovery, ethical dilemmas around AI vulnerability disclosure, and navigating the ethical landscape of open-source models. They also explore exploiting vulnerabilities in superhuman Go AIs, challenges with GPT-4, and the transformative potential of AI.

Apr 9, 2024 • 37min
Ethan Perez on Selecting Alignment Research Projects (ft. Mikita Balesni & Henry Sleight)
Mikita Balesni and Henry Sleight interview Ethan Perez on selecting AI alignment research projects, discussing problem-driven vs. results-driven approaches, balancing intuition with empirical evidence, and the importance of addressing safety issues in AI. They also explore the value of mentorship for young researchers, altering project trajectories based on feedback, and knowing when to switch projects in pursuit of promising results.

Feb 20, 2024 • 1h 43min
Emil Wallner on Sora, Generative AI Startups and AI optimism
Emil Wallner discusses Sora, generative AI startups, and AI optimism. Topics include colorizing B&W pictures, Sora's capabilities, challenges, OpenAI's monopoly, hardware costs, diverse reactions to Sora, recursive self-improvement, and the future of AI models.

Feb 12, 2024 • 52min
Evan Hubinger on Sleeper Agents, Deception and Responsible Scaling Policies
In this podcast, Evan Hubinger discusses the Sleeper Agents paper and its implications. He explores threat models of deceptive behavior and the challenges of removing it through safety training. The podcast also covers the concept of chain of thought in models, detecting deployment, and complex triggers. Additionally, it delves into deceptive instrumental alignment threat models and the role of alignment stress testing in AI safety.

Jan 27, 2024 • 33min
[Jan 2023] Jeffrey Ladish on AI Augmented Cyberwarfare and compute monitoring
Jeffrey Ladish, an expert in AI-augmented cyberwarfare and compute monitoring, discusses the potential for automating cyberwarfare, the advantages AI brings to cyber attacks, the current state and dangers of AI technology, the limitations of current-generation systems, covert system penetration, and AI scaling and compute monitoring.

Jan 22, 2024 • 1h 40min
Holly Elmore on pausing AI
Holly Elmore, an AI pause advocate, discusses protests against AI advancements, motivations for pausing AGI development, the 2022 debate on an AI pause, regulations, global warming vs. AI risk, China's pace, and advocating for a pause in AI development. The podcast explores navigating media attention, grassroots activism, risk tolerance, influences on public perception, ethical considerations, and algorithmic governance.

Jan 9, 2024 • 1h 4min
Podcast Retrospective and Next Steps
Dive into the evolution of a podcast focused on superintelligence and AI safety. Discover the challenges of finding engaging guests and how content styles have shifted over time. Explore the impact of video interviews on the AI research community, as creators balance audience feedback with the need for compelling content. The discussion reveals the dynamic debates within the AI risk community during a transformative period in AI discourse.

Sep 29, 2023 • 5min
Paul Christiano's views on "doom" (ft. Robert Miles)
Dive into a thought-provoking discussion on the future of humanity amidst advanced AI. The conversation navigates between three potential outcomes: a hopeful flourishing, a grim extinction, and a survival struggle. Emphasis is placed on the urgency of creating a decision-making framework to assess these scenarios. It's a captivating exploration of the optimism, risks, and the critical need for proactive measures.

Sep 21, 2023 • 2h 5min
Neel Nanda on mechanistic interpretability, superposition and grokking
Neel Nanda, a researcher at Google DeepMind, discusses mechanistic interpretability in AI, induction heads in models, and his journey into alignment. He explores scalable oversight, how ambitious interpretability of transformer architectures can be, and the capacity of humans to understand complex models. The podcast also covers linear representations in neural networks, the concept of superposition in models and features, the MATS mentorship program, and the importance of interpretability in AI systems.