“0. CAST: Corrigibility as Singular Target” by Max Harms

Aug 7, 2024

This episode introduces the concept of corrigibility in AI and argues that this single property may be crucial for creating agents that are both effective and safe. It covers strategies for measuring and strengthening corrigibility during AI development, critiques the usual approach of pursuing a mix of goals, and proposes a streamlined focus on corrigibility alone to improve outcomes.

Chapters

Exploring Corrigibility as a Singular Target for AGI Development

00:00 • 20min