
19 - Mechanistic Interpretability with Neel Nanda
AXRP - the AI X-risk Research Podcast
The Basic Algorithm of a Transformer
The algorithm has three steps. One of them is mapping the inputs a and b to trig terms like sin(wa). That part is actually really easy, because the model doesn't need to learn the general sine function; it only needs to learn sine on 113 memorized values. Note that I studied mod 113, though the same algorithm seems to transfer to everything else we've checked. Lawrence made a great diagram for the paper that people should take a look at. The hard part of the algorithm is multiplying together the trig term for a and the trig term for b to get the a + b rotation.
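The trig-based approach described above can be sketched in a few lines of Python. This is a hand-written illustration, not the trained model's weights or the authors' code: the function name `mod_add_via_trig`, the choice of frequencies `(1, 2, 3)`, and the final read-out step (scoring each candidate answer c by cos(w(a + b - c))) are assumptions made for the example. The episode excerpt only names the lookup step and the multiplication step explicitly.

```python
import math

P = 113  # modulus mentioned in the episode


def mod_add_via_trig(a, b, p=P, freqs=(1, 2, 3)):
    """Compute (a + b) % p using only trig lookups and products.

    freqs is a hypothetical choice; a trained network would pick
    its own small set of frequencies.
    """
    scores = [0.0] * p
    for k in freqs:
        w = 2 * math.pi * k / p
        # Step 1: look up trig terms for each input. Only p distinct
        # values per frequency are ever needed, so this could be a table.
        sin_a, cos_a = math.sin(w * a), math.cos(w * a)
        sin_b, cos_b = math.sin(w * b), math.cos(w * b)
        # Step 2 (the hard part): multiply the terms together to get
        # the a + b rotation via the angle-addition identities.
        cos_ab = cos_a * cos_b - sin_a * sin_b
        sin_ab = sin_a * cos_b + cos_a * sin_b
        # Step 3 (assumed read-out): score each candidate c with
        # cos(w(a + b - c)), which peaks when c == (a + b) mod p.
        for c in range(p):
            scores[c] += cos_ab * math.cos(w * c) + sin_ab * math.sin(w * c)
    return max(range(p), key=scores.__getitem__)


if __name__ == "__main__":
    print(mod_add_via_trig(100, 50))
```

Every frequency contributes its maximum score of 1 at the correct answer, so summing a few frequencies makes the right c stand out from the rest.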