

AI Safety Newsletter
Center for AI Safety
Narrations of the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.
This podcast also contains narrations of some of our publications.
ABOUT US
The Center for AI Safety (CAIS) is a San Francisco-based research and field-building nonprofit. We believe that artificial intelligence has the potential to profoundly benefit the world, provided that we can develop and use it safely. However, in contrast to the dramatic progress in AI, many basic problems in AI safety have yet to be solved. Our mission is to reduce societal-scale risks associated with AI by conducting safety research, building the field of AI safety researchers, and advocating for safety standards.
Learn more at https://safe.ai
Episodes

Mar 31, 2025 • 12min
AISN #50: AI Action Plan Responses
The podcast dives into responses from major AI companies to a federal AI Action Plan, underscoring the need for better policymaking and accountability. It also examines an OpenAI paper on detecting misbehavior in reasoning models, discussing how monitoring chains of thought can help catch it. Additionally, the complexities of proxy gaming in AI are explored, revealing the trade-offs involved in optimizing these systems. The emphasis throughout is on advancing transparency and regulation in AI development.

Mar 6, 2025 • 12min
AISN #49: Superintelligence Strategy
Delve into the national security challenges posed by superintelligent AI systems. Discover a strategic policy framework that emphasizes deterrence, nonproliferation, and global competitiveness. Uncover alarming trends in AI honesty, as deception in models raises significant concerns. The discussion highlights the crucial need for improved alignment and trustworthiness in AI technology to ensure our safety.

Mar 5, 2025 • 0sec
Superintelligence Strategy: Expert Version
The podcast delves into the destabilizing effects of superintelligence on national security. It introduces the concept of Mutual Assured AI Malfunction (MAIM) to manage risks between rival states. The discussion compares AI’s implications to nuclear technology, emphasizing the need for robust governance frameworks. Ethical considerations for AI agents are explored, focusing on accountability and safety. The importance of maintaining human oversight in decision-making is highlighted to prevent potential ethical breaches as automation advances.

Mar 5, 2025 • 0sec
Superintelligence Strategy: Standard Version
The discussion centers on the precarious nature of superintelligence and its potential to destabilize national security. It introduces the intriguing concept of Mutual Assured AI Malfunction (MAIM) as a deterrence strategy to mitigate risks from rival states. The conversation also highlights the tension between advancing AI capabilities and ensuring safety, stressing that heightened intelligence doesn't guarantee safer systems. Experts call for targeted research and ethical frameworks to address these challenges effectively.

Feb 18, 2025 • 9min
AISN #48: Utility Engineering and EnigmaEval
Discover the intriguing world of Utility Engineering, where large language models are revealed to possess structured value systems rather than being just passive tools. The podcast dives into groundbreaking findings that challenge conventional understanding of AI's capabilities. It also introduces EnigmaEval, a benchmark designed to evaluate AI's creative problem-solving skills. Plus, there's a spotlight on job openings at the Center for AI Safety focused on tackling AI's impacts on crucial societal areas.

Feb 6, 2025 • 9min
AISN #47: Reasoning Models
A new frontier reasoning model, DeepSeek-R1, is making waves in AI with its impressive capabilities in mathematics, coding, and scientific reasoning. Meanwhile, state-sponsored AI cyberattacks pose significant challenges, as over 20 countries leverage advanced technology for cyber warfare. The conversation also highlights fresh developments in AI safety and regulation, featuring groundbreaking frameworks and government initiatives. It's a thought-provoking exploration of how AI is reshaping our world.

Jan 23, 2025 • 11min
AISN #46: The Transition
The podcast dives into significant shifts in AI policy during the presidential transition from Biden to Trump. It highlights Biden's last-minute executive orders addressing AI risks and Trump's immediate revocation of those policies. The discussion also covers Humanity's Last Exam, a benchmark designed to measure AI capabilities on expert-level questions, and an upcoming course on AI Safety, Ethics, and Society. Tune in to understand how these changes impact the future of AI governance and safety.

Dec 19, 2024 • 12min
AISN #45: Center for AI Safety 2024 Year in Review
As 2024 winds down, the conversation dives into the year's achievements in AI safety. Innovative research on circuit breakers shows promise in preventing dangerous AI behavior, and a jailbreaking competition tested just how resilient these defenses are. Highlights include new benchmarks for assessing AI risks and advocacy efforts engaging policymakers on societal challenges. This overview captures the momentum toward making AI safer for everyone.

Nov 19, 2024 • 11min
AISN #44: The Trump Circle on AI Safety
The podcast explores the potential shifts in U.S. AI safety policy with Trump's incoming administration. Notably, Elon Musk emerges as a significant advocate for AI safety within Trump's circle. Discussions also highlight Chinese researchers employing Llama for military purposes and a Google AI system uncovering a zero-day cybersecurity vulnerability. The complexities of AI's emergent behaviors in the context of cybersecurity are examined, underscoring the need for regulatory measures and international dialogue on AI risks.