

Model Evaluation for Extreme Risks
May 13, 2023
The podcast examines the role of model evaluation in addressing extreme risks posed by AI systems, covering both dangerous capability evaluations and assessments of a model's propensity to cause harm. The chapters explore different aspects of model evaluation, including alignment evaluations and evaluating agency in AI systems. Later chapters cover the limitations and hazards of model evaluation, the risks of conducting dangerous capability evaluations and sharing evaluation materials, and the importance of effective evaluations for AI safety and governance.
Chapters
Introduction
00:00 • 3min
Model Evaluation for Extreme Risks in AI
02:45 • 15min
Model Evaluation for Extreme Risks in AI
17:40 • 22min
Challenges in Alignment Evaluation and Evaluating Agency in AI Systems
39:39 • 3min
Limitations and Hazards of Model Evaluation for Extreme Risks
42:25 • 3min
Risks Related to Conducting Dangerous Capability Evaluations and Sharing Materials
45:28 • 4min
Superficial Improvements to Model Safety and Hazards during Evaluation
49:34 • 2min
Model Evaluation for Extreme Risks in AI Safety and Governance
51:31 • 5min