3min chapter

AI Safety Fundamentals: Alignment cover image

Deceptively Aligned Mesa-Optimizers: It’s Not Funny if I Have to Explain It

AI Safety Fundamentals: Alignment

CHAPTER

The Importance of Inner Alignment

The process will probably start with us running a gradient-descended loop with some kind of objective function. That will produce a meso-optimizer with some other, potentially different objective function. Outer alignment problems tend to sound like saucer as apprentice. We tell the AI to pick strawberries, but we forgot to include caveats and stop signals. The AI becomes super-intelligent and converts the whole world into strawberries so it can pick as many as possible. This is the scenario that a lot of AI alignment research focuses on. And we create the first true planning agent on purpose or by accident.

00:00

Get the Snipd
podcast app

Unlock the knowledge in podcasts with the podcast player of the future.
App store bannerPlay store banner

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode

Save any
moment

Hear something you like? Tap your headphones to save it with AI-generated key takeaways

Share
& Export

Send highlights to Twitter, WhatsApp or export them to Notion, Readwise & more

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode