
"Steering GPT-2-XL by adding an activation vector" by TurnTrout et al.

LessWrong (Curated & Popular)


Editing Models With Task Arithmetic

Activation additions can be continuously weighted, while prompts are discrete. If you want the model to talk even more about weddings, you don't need to contort the prompt; just increase the injection coefficient. We think that activation additions will generalize prompts by allowing weights on token embeddings, and will improve prompt engineering. In a future post we will use this to highlight interesting high-level facts about LLMs.
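To make the contrast with discrete prompts concrete, here is a minimal toy sketch of adding a scaled activation vector. It does not touch GPT-2-XL: the activations are made-up lists of floats, and the names `steering_vector`, `add_activation`, and `coeff` are hypothetical, chosen only for illustration. It assumes the steering vector is the per-position difference between the activations of two contrasting prompts, added to the first positions of the stream.

```python
from typing import List

Vector = List[float]


def steering_vector(act_plus: List[Vector], act_minus: List[Vector]) -> List[Vector]:
    """Per-position difference between activations of two contrasting prompts."""
    return [[p - m for p, m in zip(vp, vm)] for vp, vm in zip(act_plus, act_minus)]


def add_activation(resid: List[Vector], steer: List[Vector], coeff: float) -> List[Vector]:
    """Return a copy of resid with coeff * steer added to its first positions."""
    out = [row[:] for row in resid]
    for pos, svec in enumerate(steer[: len(out)]):
        out[pos] = [r + coeff * s for r, s in zip(out[pos], svec)]
    return out


# Toy numbers (assumed, not taken from any model): 3 positions, 2 dimensions.
resid = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
steer = steering_vector([[1.0, 2.0]], [[0.0, 1.0]])

# The coefficient is a real number, so the nudge can be dialed up or down
# smoothly, which a token that is either present or absent cannot do.
weak = add_activation(resid, steer, coeff=0.5)
strong = add_activation(resid, steer, coeff=4.0)
```

Changing `coeff` from 0.5 to 4.0 strengthens the same intervention without rewriting any prompt text, which is the sense in which the weighting is continuous.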

