Deceptively Aligned Mesa-Optimizers: It’s Not Funny if I Have to Explain It

AI Safety Fundamentals: Alignment

Introduction

Machine Alignment Monday is a weekly, offbeat look at AI. This week's topic: deceptively aligned mesa-optimizers. We'll be focusing on Machine Alignment Mondays instead of Mantic and Model City Mondays. The show will still feature regular content from the rest of our team.

