Deceptively Aligned Mesa-Optimizers: It’s Not Funny if I Have to Explain It

AI Safety Fundamentals: Alignment

AlphaGo: A Mesa-Optimizer

AlphaGo is kind of a mesa-optimizer. You could approximate it as a gradient descent loop creating a good Go move optimizer, but this would only be an approximation. DeepMind hard-coded some parts of AlphaGo, then gradient-descended other parts. Its objective function is to win games of Go, which is hard-coded and pretty clear. Whether or not you choose to call it a mesa-optimizer, it's not a very scary one.
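To make the "some parts hard-coded, other parts gradient-descended" split concrete, here is a minimal toy sketch. It is not AlphaGo's architecture. It assumes a hypothetical one-pile stone game (take 1 to 3 stones; whoever takes the last stone wins), and every name in it (`PILE`, `legal_moves`, `choose_move`, `train`) is invented for illustration. The rules, the win condition, and the one-step search are written by hand; only the position values are tuned, using a simple semi-gradient step on squared error.

```python
import random

PILE = 10  # assumed toy game: take 1-3 stones; taking the last stone wins


def legal_moves(stones):
    """Hard-coded rules: a player may take 1, 2, or 3 stones."""
    return [m for m in (1, 2, 3) if m <= stones]


def choose_move(stones, value):
    """Hard-coded search: one-step lookahead over legal moves.

    Picks the move that leaves the opponent in the position the
    learned table rates worst for the player to move.
    """
    return min(legal_moves(stones), key=lambda m: value[stones - m])


def train(steps=20000, lr=0.1, seed=0):
    """Learned part: tune value[s], the estimated chance that the
    player to move wins with s stones left."""
    rng = random.Random(seed)
    value = [0.5] * (PILE + 1)
    value[0] = 0.0  # hard-coded objective: no stones left means you lost
    for _ in range(steps):
        s = rng.randint(1, PILE)
        # Target: I win to the extent my best move leaves the opponent losing.
        target = 1.0 - min(value[s - m] for m in legal_moves(s))
        # Gradient step on 0.5 * (value[s] - target)**2, target held fixed.
        value[s] -= lr * (value[s] - target)
    return value


if __name__ == "__main__":
    value = train()
    for stones in (10, 7, 5):
        print(stones, "stones -> take", choose_move(stones, value))
```

In this sketch the outer training loop never decides a move itself; it only shapes the table that the hand-written search consults. The objective stays fixed in the code, which is the sense in which the passage calls AlphaGo's goal hard-coded and clear.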
