AI Alignment as a Solvable Problem | Leopold Aschenbrenner & Richard Hanania

CSPI Podcast

The Truth Neuron: How It Determines Truth in Models

The hope for AI alignment is that, in the same way we'll be able to use AIs to automate AI capabilities research and build more powerful AI systems, maybe we can use AI to help automate interpretability research. And in fact, there's a paper that came out from OpenAI taking embryonic steps toward automating interpretability.
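
As a rough illustration of what automating interpretability can look like, here is a minimal sketch in the spirit of OpenAI's neuron-explanation work: show an "explainer" model text excerpts annotated with a neuron's per-token activations, then ask it to guess what concept the neuron fires on. The prompt format and the `query_llm` helper are illustrative assumptions, not the paper's actual code or API.

```python
# Sketch of an automated-interpretability loop, loosely following the idea in
# OpenAI's 2023 paper "Language models can explain neurons in language models".
# `query_llm` is a hypothetical stand-in for any chat-completion call.

from typing import Callable, List, Tuple

ActivationRecord = List[Tuple[str, float]]  # (token, neuron activation)


def format_record(record: ActivationRecord) -> str:
    """Render one excerpt with per-token activations for the explainer model."""
    return " ".join(f"{tok}({act:.1f})" for tok, act in record)


def build_explanation_prompt(records: List[ActivationRecord]) -> str:
    """Assemble a prompt asking the explainer to summarize the neuron's behavior."""
    excerpts = "\n".join(f"- {format_record(r)}" for r in records)
    return (
        "Each excerpt below shows tokens with a neuron's activation in parentheses.\n"
        f"{excerpts}\n"
        "In one sentence, what concept does this neuron respond to?"
    )


def explain_neuron(
    records: List[ActivationRecord], query_llm: Callable[[str], str]
) -> str:
    """Use a stronger model to propose a natural-language explanation."""
    return query_llm(build_explanation_prompt(records))


if __name__ == "__main__":
    # Toy activation records for a hypothetical "truth neuron".
    records = [
        [("the", 0.0), ("claim", 0.1), ("is", 0.0), ("true", 9.3)],
        [("that", 0.0), ("statement", 0.2), ("was", 0.0), ("false", 8.7)],
    ]
    # Stub in place of a real model call, so the sketch runs standalone.
    stub = lambda prompt: "Fires on truth-value words like 'true' and 'false'."
    print(explain_neuron(records, stub))
```

The paper's full pipeline goes a step further than this sketch: it scores each proposed explanation by having another model simulate the neuron's activations from the explanation alone and comparing against the real activations.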
