Can This AI Save Teenage Spy Alex Rider From A Terrible Fate?

Astral Codex Ten Podcast

AI Alignment

AI scientists have debated these questions for years, usually as pure philosophy. We have finally reached a point where AIs are smart enough for us to run the experiment directly. Redwood Research embarked on an ambitious project to test whether AIs could learn categories and reach alignment this way, a project that would require a dozen researchers, thousands of dollars of compute, and 4,300 Alex Rider fan fiction stories. To test their AI alignment plan, Redwood needed an AI and a goal to align it to. For their AI, they chose GPT-Neo, a popular and well-studied language model that completes text prompts.
