How Do Induction Heads Work?
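The algorithm described in this segment (map things to latent space, look for matches, look at the thing immediately after the match) can be illustrated with a toy sketch. This is not how a transformer head is implemented: as a loud simplification, exact token equality stands in for matching in latent space, and the function name `induction_predict` is hypothetical.

```python
from typing import Optional, Sequence


def induction_predict(tokens: Sequence[str]) -> Optional[str]:
    """Toy induction step over a token sequence.

    Finds the most recent earlier occurrence of the current (last) token
    and returns the token that immediately followed it, or None if the
    current token has not appeared before.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Scan earlier positions from most recent to oldest, skipping the
    # final position, which is the current token itself.
    for i in range(len(tokens) - 2, -1, -1):
        # Assumption: exact equality replaces the latent-space match.
        if tokens[i] == current:
            # "Look at the thing immediately after the match."
            return tokens[i + 1]
    return None


if __name__ == "__main__":
    print(induction_predict(["Mr", "Dursley", "was", "Mr"]))
```

On this reading, a translation head would presumably differ only in the matching step, with tokens in two languages mapped to nearby points in latent space instead of being compared for equality.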
There's a fun thing in the paper where you can go play around with the attention pattern. I found a couple of heads like that in GPT-J, and figuring out what's up with those is on my long-term to-do list.

Okay.

It's one of the problems in my [unclear] problem sequence, so if anyone wants to go try that, I'm very excited to see what you find. Anyway, yeah, the translation head is also an induction head, and my guess is it's just the same fundamental algorithm: map things to latent space, look for matches, look at the thing immediately after the match. The model has just learned how to do something sensible here.

Is it that the same head does French to English, English