

AI Safety Fundamentals
BlueDot Impact
Listen to resources from the AI Safety Fundamentals courses! https://aisafetyfundamentals.com/
Episodes

May 13, 2023 • 12min
Avoiding Extreme Global Vulnerability as a Core AI Governance Problem
The podcast covers various framings of the AI governance problem, the incentives driving harmful deployment of AI, the risks posed by delayed safety work and the rapid diffusion of AI capabilities, ways to address the widespread deployment of harmful AI, and approaches to avoiding extreme global vulnerability in AI governance.

May 13, 2023 • 22min
AI Safety Seems Hard to Measure
Holden Karnofsky, AI safety researcher, discusses the challenges of measuring AI safety and the risk that AI systems develop dangerous goals. The podcast explores key difficulties in AI safety research, including deception, black-box AI systems, and understanding and controlling AI systems.

May 13, 2023 • 17min
Nobody's on the Ball on AGI Alignment
The podcast discusses the shortage of researchers working on AI alignment relative to those working on machine learning capabilities. It highlights how limited research in the field of alignment remains and argues for a more rigorous, concerted effort. Approaches to achieving alignment in AGI are explored, along with the challenge of aligning superhuman AGI with human values, and the episode stresses the importance of drawing talented ML researchers into focused work on the core technical difficulties of the alignment problem.

May 13, 2023 • 33min
Emergent Deception and Emergent Optimization
This podcast discusses the potential negative consequences of emergent capabilities in machine learning systems, including deception and optimization. It explores emergent behavior in AI models, the limitations of certain models, how language models can deceive users, and evidence of planning machinery within language models. The episode highlights the risks of triggering goal-directed personas in language models and of conditioning models on training data that contains descriptions of plans.

May 13, 2023 • 16min
Primer on Safety Standards and Regulations for Industrial-Scale AI Development
This podcast discusses the importance of safety standards and regulations for industrial-scale AI development. It explores the potential and limitations of these regulations, including challenges such as regulatory capture and under-resourced regulators. The podcast also highlights proposals for AI safety practices and recent policy developments in different countries. It emphasizes the need for controllable and aligned AI agents to prevent potential risks and the establishment of safety standards and regulations to protect intellectual property rights and personal information.

May 13, 2023 • 30min
Frontier AI Regulation: Managing Emerging Risks to Public Safety
This podcast discusses the need for proactive regulation of Frontier AI models to manage risks. It explores challenges in regulating Frontier AI, proposes building blocks for regulation, and suggests safety standards. The chapters cover topics like oversight and governance, regulatory tools, licensing at the development stage, and the risks of premature government action. The podcast emphasizes the importance of compliance, expertise, and a balanced regulatory regime in AI safety.

May 13, 2023 • 56min
Model Evaluation for Extreme Risks
The podcast highlights the significance of model evaluation in addressing extreme risks posed by AI systems. It discusses the importance of evaluating dangerous capabilities and assessing the propensity of models to cause harm. The chapters explore different aspects of model evaluation, including alignment evaluations and evaluating agency in AI systems. The podcast also discusses the limitations and hazards of model evaluation, risks related to conducting dangerous capability evaluations and sharing materials, and the importance of effective evaluations in AI safety and governance.

May 13, 2023 • 21min
Racing Through a Minefield: The AI Deployment Problem
The podcast explores the challenges of developing and deploying powerful AI systems without causing global catastrophe. It discusses the need for cautious decision-making, threat assessment, and global monitoring, as well as the importance of collaboration, information sharing, and using AI itself for threat assessment and risk mitigation.

May 13, 2023 • 36min
The State of AI in Different Countries – An Overview
This podcast discusses the potential impact of AI regulation on technological advancement, focusing on the United States and its dominance in AI development. It explores hindrances to China's progress and the global AI landscape, and examines the state of AI research across countries, highlighting China's leadership in publications. It discusses China's data advantage and the US-dominated semiconductor supply chain. The significance of knowledge diffusion for countries' AI progress is explored, along with plagiarism in AI research and the growing trend toward global AI regulation.

May 13, 2023 • 25min
Primer on AI Chips and AI Governance
This podcast explores the regulation of AI chips in governing frontier AI development. It discusses challenges in regulating data and algorithms, the global supply chain in manufacturing AI chips, and the potential for government regulation of AI chip quantities. It also covers the assembly process and selling of AI chips, the dominance of US companies in the market, and the scale and cost of chip fabrication factories.