
24 - Superalignment with Jan Leike

AXRP - the AI X-risk Research Podcast


Neural Networks Generalize Across Languages

The hope is that maybe interpretability could tell you things like: it's got some kernel of lying in there, but it only unlocks here. I think fundamentally this is also a really interesting machine learning problem: how do neural networks actually generalize outside of the IID setting, and what are the mechanisms at work here? We don't have an answer to this question so far, no one does, but it seems like a thing that's really important to understand.

"Yeah, you're going to need some sort of causal modeling there, from alleged beliefs to outputs, right?"

"You might need a lot more."
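To make "generalizing outside the IID setting" concrete, here is a minimal sketch of the standard way to measure it: train on one distribution, then evaluate on a shifted one and compare. The dataset, model, and rotation shift are illustrative choices of mine, not anything discussed in the episode.

```python
# Illustrative sketch: IID vs. out-of-distribution evaluation.
# All specific choices (digits dataset, logistic regression, 30-degree
# rotation as the shift) are assumptions for the example, not from the source.
import numpy as np
from scipy.ndimage import rotate
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()
X, y = digits.images, digits.target

# IID split: train and test are drawn from the same distribution.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = LogisticRegression(max_iter=5000)
clf.fit(X_train.reshape(len(X_train), -1), y_train)
iid_acc = clf.score(X_test.reshape(len(X_test), -1), y_test)

# Distribution shift: rotate the test images. Labels are unchanged,
# but the inputs no longer come from the training distribution.
X_shifted = np.stack([rotate(img, angle=30, reshape=False) for img in X_test])
ood_acc = clf.score(X_shifted.reshape(len(X_shifted), -1), y_test)

print(f"IID accuracy:     {iid_acc:.3f}")
print(f"Shifted accuracy: {ood_acc:.3f}")
```

The gap between the two numbers is the phenomenon under discussion; the open research question in the quote is what mechanisms inside the network determine how large that gap is.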

