AI Safety Fundamentals cover image

AI Safety Fundamentals

Listen to resources from the AI Safety Fundamentals courses!https://aisafetyfundamentals.com/

Latest episodes

Jan 4, 2025 • 1h 2min

Constitutional AI Harmlessness from AI Feedback

This paper explains Anthropic’s constitutional AI approach, which is largely an extension on RLHF but with AIs replacing human demonstrators and human evaluators.A podcast by BlueDot Impact.Learn more on the AI Safety Fundamentals website.

Jan 4, 2025 • 12min

Introduction to Mechanistic Interpretability

Our introduction introduces common mech interp concepts, to prepare you for the rest of this session's resources.Original text: https://aisafetyfundamentals.com/blog/introduction-to-mechanistic-interpretability/Author(s): Sarah Hastings-WoodhouseA podcast by BlueDot Impact.Learn more on the AI Safety Fundamentals website.

Jan 2, 2025 • 40min

If-Then Commitments for AI Risk Reduction

Holden Karnofsky, a visiting scholar at the Carnegie Endowment for International Peace, delves into his innovative 'If-Then' commitments for managing AI risks. He outlines how these structured responses can ensure proactive safety measures without stifling innovation. The discussion highlights the importance of timely interventions as AI technology evolves, ensuring developments stay safe and beneficial. Karnofsky also touches on the challenges of implementing these commitments and the necessity of regulatory compliance across sectors.

Jan 2, 2025 • 11min

This is How AI Will Transform How Science Gets Done

This article by Eric Schmidt, former CEO of Google, explains existing use cases for AI in the scientific community and outlines ways that sufficiently advanced, narrow AI models might transform scientific discovery in the near future. As you read, pay close attention to the existing case studies he describes.Original text: https://www.technologyreview.com/2023/07/05/1075865/eric-schmidt-ai-will-transform-science/ Author(s): Eric SchmidtA podcast by BlueDot Impact.Learn more on the AI Safety Fundamentals website.

Dec 30, 2024 • 56min

Open-Sourcing Highly Capable Foundation Models: An Evaluation of Risks, Benefits, and Alternative Methods for Pursuing Open-Source Objectives

The discussion tackles the double-edged sword of releasing powerful foundation models. While transparency can fuel innovation, the potential for misuse by malicious actors is alarming. Cyberattacks, the development of biological weapons, and disinformation loom large. The conversation also emphasizes the necessity for responsible AI governance, advocating for structured access and staged releases as safer alternatives to total openness. Legal liabilities and the implications of regulation are scrutinized, highlighting the critical need for thorough risk assessments.

Dec 30, 2024 • 41min

So You Want to be a Policy Entrepreneur?

Discover the dynamic world of policy entrepreneurs and their role in driving innovation. Uncover strategies like problem framing and coalition building that empower them to tackle global challenges, especially climate change. Learn how collaborative networks help share vital knowledge and foster legislative support. Historical examples illustrate the impact of these leaders in areas like California’s stem cell research. Dive into the ongoing fight against violence in conflict, showcasing the critical need for advocacy and transformative policies.

Dec 30, 2024 • 26min

Considerations for Governing Open Foundation Models

Discover the debate surrounding open foundation models and their potential to drive innovation and competition. The discussion highlights the ethical dilemmas of open versus closed models and the need for thoughtful policy design. Examining the risks, such as disinformation, the speakers argue that evidence for these dangers is limited. They advocate for policies that consider the unique characteristics of open models to prevent stifling development while promoting transparency and reducing monopolistic power in AI.

May 22, 2024 • 36min

Driving U.S. Innovation in Artificial Intelligence: A Roadmap for Artificial Intelligence Policy in the United States Senate

The podcast discusses the U.S. Senate's AI policy roadmap, including AI Ready Data, National AI Research Resource, AI safety, and election security. It explores funding and collaboration in AI applications for defense and national security, as well as addressing legal and regulatory gaps in AI systems. Policy recommendations cover export controls, security mechanisms, and collaboration with international partners.

May 20, 2024 • 40min

The AI Triad and What It Means for National Security Strategy

Ben Buchanan, author of the AI Triad framework, discusses the inputs powering machine learning: algorithms, data, and compute. The podcast explores the impact of these components on national security strategy, the disparities between machine learning and traditional programming, and the application of machine learning in national security, robotics, and AI advancements.

May 20, 2024 • 46min

Societal Adaptation to Advanced AI

Authors Jamie Bernardi and Gabriel discuss societal adaptation to advanced AI systems in a paper, emphasizing the need for adaptive strategies and resilience. Topics include managing AI risks, interventions, loss of control to AI decision-makers, adaptive strategies, and responses to AI threat models.

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

App store banner

Play store banner