Induction Heads in a Two Layer Model
Redwood Research has been doing this work with causal scrubbing; they're trying to build these rigorous techniques. They found a circuit that appears in two-layer attention-only models that looks at the current token and says: has this token appeared in the past? If yes, then let's assume the thing that came after it is going to come next.

Yeah, so you could imagine that if you've got the token James and you want to figure out what comes next, you're like, is this a piece about James Bond or just some random dude called James? So we want the induction head to attend from James to the first occurrence of Bond, because Bond is preceded by James, which is a copy of the current token.
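To make the James Bond example concrete, here is a minimal sketch of the rule just described, written as plain Python rather than as an actual attention head. The function names `induction_target` and `induction_predict` are made up for illustration, and the sketch assumes exact token matching and picks the earliest earlier copy of the current token; a real induction head computes soft attention weights and may spread attention over several matches.

```python
from typing import Optional, Sequence


def induction_target(tokens: Sequence[str]) -> Optional[int]:
    """Position an idealized induction head would attend to from the last token.

    Hypothetical helper: finds the earliest earlier copy of the current
    (last) token and returns the position right after that copy.
    """
    if not tokens:
        return None
    current = tokens[-1]
    # Scan every position before the current one for a copy of the current token.
    for i in range(len(tokens) - 1):
        if tokens[i] == current:
            return i + 1  # the token that followed the earlier copy
    return None  # the current token has not appeared before


def induction_predict(tokens: Sequence[str]) -> Optional[str]:
    """Guess the next token by copying whatever followed the earlier copy."""
    pos = induction_target(tokens)
    return None if pos is None else tokens[pos]


if __name__ == "__main__":
    text = ["James", "Bond", "ordered", "a", "drink", ".", "James"]
    print(induction_target(text), induction_predict(text))
```

With the token list above, the last James matches the James at position 0, so the sketch attends to position 1 and guesses Bond. If James had not appeared earlier, both functions return `None`, which corresponds to the "just some random dude called James" case where the head has nothing to copy from.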