Analyzing Indirect Object Identification in GPT-2-Small

The chapter examines a circuit in GPT-2-Small that implements Indirect Object Identification (IOI), focusing on how attention heads interact and move information between tokens in a sentence. It discusses path patching as a way to separate the direct effects of an attention head from its indirect effects, measuring the impact on logit differences and identifying the critical pathways in the model's computation. The chapter also covers scaling issues, methods for studying attention head outputs, and the influence of Name Mover heads on the output logits.
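
As a rough illustration of the logit-difference metric mentioned above, the sketch below assumes the usual convention of subtracting the subject token's logit from the indirect object token's logit, and then reports how much that difference shifts when a head is patched. The function names, token indices, and logit values are all hypothetical and chosen for illustration; none of them come from the episode.

```python
def logit_diff(final_logits, io_token_id, s_token_id):
    """Logit of the indirect object token minus logit of the subject token."""
    return final_logits[io_token_id] - final_logits[s_token_id]


def relative_change(clean_diff, patched_diff):
    """Fractional change in logit difference after patching (assumes clean_diff != 0)."""
    return (patched_diff - clean_diff) / clean_diff


# Toy four-token vocabulary; the numbers are made up, not model outputs.
clean_logits = [0.5, 4.0, 1.0, 0.0]
patched_logits = [0.5, 2.5, 1.0, 0.0]
IO, S = 1, 2  # hypothetical token ids for the indirect object and the subject

clean = logit_diff(clean_logits, IO, S)
patched = logit_diff(patched_logits, IO, S)
change = relative_change(clean, patched)

print(clean, patched, change)
```

In this toy case the patched run halves the logit difference, which is the kind of drop one would read as evidence that the patched pathway matters for the task.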
