The Gradient: Perspectives on AI

Jeremie Harris: Realistic Alignment and AI Policy

CHAPTER

The Importance of Alignment in Language Models

There is a concern, on the timescale of years, about catastrophic risk from AI, where you have a system that pursues incentives that differ from our own. An AI system is never better off pursuing its training objective if it gets turned off, if it has access to fewer resources, or if it's less intelligent. And what we see with that paper is an early sign that controllability may be a much more difficult problem than we might have thought.
