ML Systems Will Have Weird Failure Modes

AI Safety Fundamentals: Alignment

Introduction

Delve into a detailed thought experiment on the potential emergent capabilities of future ML systems, examining the implications of modeling an ML agent as a perfect optimizer with intrinsic and extrinsic reward functions. Explore skepticism towards emergent behavior, how such thought experiments clash with the ontology of neural networks, and strategies for deriving actionable insights from these scenarios.

Transcript