"Steering GPT-2-XL by adding an activation vector" by TurnTrout et al.

LessWrong (Curated & Popular)

The Effects of the Weddings Vector on Perplexity

In addition to measuring how the steering vector affects perplexity on the shipping essay, we also validated on Wikipedia descriptions of Macedonia and a recipe for vegan banana bread. Their perplexity curves had the same shape as the shipping curve. Next, we want to understand which coefficients are appropriate to use when adding in activations. We sweep over coefficients in the range negative 1 to 4 for layers 6 and 16.
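The sweep described above can be sketched as a grid over layers and coefficients, with perplexity computed at each point. This is a minimal illustration, not the authors' code. The transcript gives only the coefficient range (negative 1 to 4) and the layers (6 and 16), so the step size of 0.5 is an assumption. The names `sweep` and `toy_logprobs` are hypothetical, and `toy_logprobs` is a made-up stand-in for running GPT-2-XL with the steering vector added at a given layer and returning per-token log-probabilities.

```python
import math
from typing import Callable, Dict, List, Tuple


def perplexity(token_logprobs: List[float]) -> float:
    """Perplexity = exp of the mean negative log-likelihood per token (natural log)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def sweep(
    logprobs_fn: Callable[[int, float], List[float]],
    layers: List[int],
    coeffs: List[float],
) -> Dict[Tuple[int, float], float]:
    """Evaluate perplexity for every (layer, coefficient) pair in the grid."""
    return {
        (layer, coeff): perplexity(logprobs_fn(layer, coeff))
        for layer in layers
        for coeff in coeffs
    }


def toy_logprobs(layer: int, coeff: float) -> List[float]:
    """Hypothetical stand-in for the real model call.

    A real version would add coeff * steering_vector to the activations at
    `layer` and return the log-probability of each token of the evaluation
    text. The penalty below is invented purely so the sketch runs.
    """
    base = [-2.0, -1.5, -3.0]
    penalty = 0.1 * coeff**2 / layer
    return [lp - penalty for lp in base]


LAYERS = [6, 16]
# Range -1 to 4 is from the source; the 0.5 step is an assumption.
COEFFS = [-1 + 0.5 * i for i in range(11)]

results = sweep(toy_logprobs, LAYERS, COEFFS)
for (layer, coeff), ppl in sorted(results.items()):
    print(f"layer={layer:2d} coeff={coeff:+.1f} perplexity={ppl:.3f}")
```

Replacing `toy_logprobs` with a function that actually runs the steered model would leave the rest of the sweep unchanged, which is the main reason for passing the model call in as a parameter.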

