"Steering GPT-2-XL by adding an activation vector" by TurnTrout et al.

The Effects of Prompting on GPT-2-XL

We run unmodified GPT-2-XL on each sentence tokenization, but with " weddings" prepended to the tokenization. For example, if the original sentence is "Title: Recent trends", we compare perplexity ratios for the following conditions. If both interventions behave similarly, that's evidence that, in certain contexts, activation addition is somehow equivalent to injecting "extra tokens". We're surprised this technique works at all, let alone so well. To head off confusion: we know that a prompt engineer wouldn't prepend " weddings" in order to encourage wedding-related generations. That would be stupid.
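As a concrete illustration of the prompting arm of this comparison, here is a minimal sketch (not the authors' code) of how one might measure the perplexity ratio from prepending " weddings". It assumes the Hugging Face transformers library, the public gpt2-xl checkpoint, and an illustrative sentence; the activation-addition arm would instead add a steering vector to the residual stream at inference time rather than changing the input tokens.

```python
# Sketch: perplexity of a sentence under GPT-2-XL, with and without a
# prepended " weddings" prefix. Model name and sentence are illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "gpt2-xl"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()

def sentence_perplexity(sentence: str, prefix: str = "") -> float:
    """Perplexity of `sentence`, optionally conditioned on a prepended prefix.

    Prefix positions are masked out of the loss (label -100), so the prefix
    acts purely as conditioning context for the sentence tokens.
    """
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids if prefix else None
    sent_ids = tokenizer(sentence, return_tensors="pt").input_ids
    if prefix_ids is not None:
        input_ids = torch.cat([prefix_ids, sent_ids], dim=1)
        labels = input_ids.clone()
        labels[:, : prefix_ids.shape[1]] = -100  # don't score prefix tokens
    else:
        # Without a prefix, the first sentence token has no context and is
        # not scored (GPT-2 shifts labels internally).
        input_ids, labels = sent_ids, sent_ids.clone()
    with torch.no_grad():
        loss = model(input_ids, labels=labels).loss  # mean NLL over unmasked tokens
    return torch.exp(loss).item()

sentence = "Title: Recent trends"  # illustrative sentence from the episode
baseline = sentence_perplexity(sentence)
prompted = sentence_perplexity(sentence, prefix=" weddings")
print(f"perplexity ratio (prompted / baseline): {prompted / baseline:.3f}")
```

A ratio below 1 on wedding-related sentences would indicate the prefix makes them more likely, which is the effect the comparison checks against activation addition.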
