
19 - Mechanistic Interpretability with Neel Nanda
AXRP - the AI X-risk Research Podcast
Modular Addition Algorithm
Modular addition is fundamentally about rotation around the unit circle. You have a and b times 2π/n, which parameterize the rotations, and composing them gives a rotation by (a + b) times 2π/n. You can compute this by multiplying pairs of the trig terms and adding them using trig identities. It's also the sum mod n, because it wraps around the circle if you get too big. To get the c-th logit, you rotate backwards by 2πc/n to get a rotation by (a + b - c) times 2π/n.
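As a rough illustration of the rotation picture described here, the sketch below computes a + b mod n using only sines and cosines. It assumes a single frequency of 2π/n and takes the c-th logit to be the cosine of the final rotation angle; the function name `modular_add_via_rotation` and both of those choices are illustrative assumptions, not details taken from the episode.

```python
import math


def modular_add_via_rotation(a, b, n):
    """Return (a + b) mod n using rotations on the unit circle (hypothetical helper)."""
    w = 2 * math.pi / n  # assumed single frequency: one full turn per n steps

    # Represent a and b as rotations by w*a and w*b.
    cos_a, sin_a = math.cos(w * a), math.sin(w * a)
    cos_b, sin_b = math.cos(w * b), math.sin(w * b)

    # Compose the two rotations with the angle-addition identities.
    cos_ab = cos_a * cos_b - sin_a * sin_b  # cos(w * (a + b))
    sin_ab = sin_a * cos_b + cos_a * sin_b  # sin(w * (a + b))

    # Rotate backwards by w*c: each logit equals cos(w * (a + b - c)),
    # which peaks at 1 exactly when c == (a + b) mod n.
    logits = [
        cos_ab * math.cos(w * c) + sin_ab * math.sin(w * c)
        for c in range(n)
    ]
    return max(range(n), key=lambda c: logits[c])
```

Taking the argmax over c recovers the sum mod n because the cosine is largest when the net rotation is a whole number of turns, which is the wrap-around described above.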