3min chapter

Neel Nanda - Mechanistic Interpretability

Machine Learning Street Talk (MLST)

CHAPTER

How Language Models Predict the Next Token

Probing is this old family of interpretability techniques. The idea is you think a model has represented something, and you want to see if it's figured out that the picture is of a red thing versus a blue thing, even though this isn't an explicit part of the output. There's also interesting stuff about probing, but I should probably finish explaining the Othello paper first before I get into that tangent. So the reason people were really excited about this paper, which was recently an oral at ICLR and generally got a lot of hype, was that you just train something to predict the next token. In some work led by Wes Gurnee, we found neurons that detect things.
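To make that concrete, here is a minimal sketch of what a linear probe looks like in PyTorch. This is illustrative only, not code from the episode: the variable names, dimensions, and the red-versus-blue setup are assumptions. The idea is to train a small classifier on a model's hidden activations and check whether a concept can be read off them.

```python
import torch
import torch.nn as nn

# Stand-ins for real data: `activations` would come from a model's hidden
# states, `labels` from a concept you think it represents (e.g. red vs. blue).
hidden_dim, n_examples = 512, 1000
activations = torch.randn(n_examples, hidden_dim)
labels = torch.randint(0, 2, (n_examples,)).float()

# The probe is just a single linear layer trained as a binary classifier.
probe = nn.Linear(hidden_dim, 1)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(200):
    optimizer.zero_grad()
    logits = probe(activations).squeeze(-1)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()

# If a probe like this classifies held-out activations well above chance,
# the concept is (at least linearly) decodable from the representation.
with torch.no_grad():
    preds = probe(activations).squeeze(-1) > 0
    print(f"train accuracy: {(preds == labels.bool()).float().mean():.2f}")
```

On the random stand-in data above the probe can only memorize, which is why real probing experiments always evaluate on a held-out split: above-chance held-out accuracy is the evidence that the model actually represents the concept.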
