"Steering GPT-2-XL by adding an activation vector" by TurnTrout et al.

LessWrong (Curated & Popular)

How We Modify GPT-2-XL's Forward Passes

In GPT-2-XL, each residual stream is 1,600-dimensional. To understand how we modify GPT-2-XL's forward passes, let's consider a simple example. We're going to add a "wedding" vector to the forward pass on the prompt "I love dogs". The prompt is tokenized into four tokens (the three words plus a leading special token), so there will be four residual streams in the forward pass, one per token. At the end of the pass, the unembedding maps each stream's final numerical values to next-token predictions; in the example these include the words "this" and "the".
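To make the shapes concrete, here is a minimal sketch of the addition step in plain Python. It is not the authors' code and it does not load GPT-2-XL: the activations are random placeholders, and the helper name `add_steering_vector`, the coefficient, and the choice to add the vector to the first two positions are all assumptions made for illustration. Only the sizes (four residual streams, 1,600 dimensions each) come from the text.

```python
import random

D_MODEL = 1600  # residual stream width stated in the text for GPT-2-XL
N_TOKENS = 4    # one residual stream per token of the example prompt


def add_steering_vector(resid, steering, coeff=1.0):
    """Return a copy of `resid` with coeff * steering added position by position.

    resid:    list of N_TOKENS vectors, each of length D_MODEL
    steering: list of k <= N_TOKENS vectors, added to the first k positions
              (which positions to steer is an assumption of this sketch)
    """
    if len(steering) > len(resid):
        raise ValueError("steering vector covers more positions than the prompt")
    steered = [row[:] for row in resid]  # copy so the input is left untouched
    for pos, vec in enumerate(steering):
        if len(vec) != len(steered[pos]):
            raise ValueError("dimension mismatch at position %d" % pos)
        steered[pos] = [r + coeff * s for r, s in zip(steered[pos], vec)]
    return steered


random.seed(0)
# Placeholder activations; a real run would read these from the model.
resid = [[random.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(N_TOKENS)]
# Placeholder "wedding" vector spanning two positions (hypothetical).
steering = [[random.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(2)]

steered = add_steering_vector(resid, steering, coeff=4.0)
```

In this sketch the last two residual streams are left unchanged. In a real forward pass, the later layers would then run on the steered streams before unembedding.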
