AXRP - the AI X-risk Research Podcast cover image

19 - Mechanistic Interpretability with Neel Nanda

AXRP - the AI X-risk Research Podcast

CHAPTER

The Losses Depend on the Log Prop of the Correct Next Token

The losses depend on what like the actual next token was. It's not all log problems it's specifically the correct next log problem yep i'm gonna handle that deeply about this to me  it just seems more intuitive than other ways of looking at things. But a caveat to this is that specific circuits may make claims about what these should be, but by taking them into account both could achieve similar results.

00:00
Transcript
Play full episode

Remember Everything You Learn from Podcasts

Save insights instantly, chat with episodes, and build lasting knowledge - all powered by AI.
App store bannerPlay store banner