
126 - Optimizing Continuous Prompts for Generation, with Lisa Li
NLP Highlights
How to Optimize Prefix Parameters for Training
The objective is the same as the fine-tuning objective: both use the cross-entropy loss of p(y | x), except that now the set of trainable parameters is different. This difference leads to a large reduction in the parameters we need to store, because we are freezing the language model parameters, so we don't need to store them anymore; we only need to store P_theta, which is a small matrix.

Cool, yeah, that makes sense. I think it's clear to me how exactly training works. One question I had relates to a detail I saw in your paper, though: you say that directly optimizing the prefix parameters does not work, and you re-parameterize…
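To make the setup discussed above concrete, here is a minimal PyTorch-style sketch, not the authors' implementation: `lm` is assumed to be a frozen autoregressive model that maps input embeddings to next-token logits, and the prefix length, dimensions, and MLP shape are illustrative placeholders.

```python
import torch
import torch.nn as nn

PREFIX_LEN, HIDDEN_DIM, BOTTLENECK = 10, 768, 512

class PrefixTuner(nn.Module):
    def __init__(self, lm):
        super().__init__()
        self.lm = lm
        for p in self.lm.parameters():        # freeze every language-model parameter
            p.requires_grad = False
        # Re-parameterization: a small matrix plus an MLP produces the actual
        # prefix activations P_theta (more stable than optimizing P_theta directly).
        self.prefix_embed = nn.Parameter(torch.randn(PREFIX_LEN, BOTTLENECK))
        self.mlp = nn.Sequential(
            nn.Linear(BOTTLENECK, BOTTLENECK),
            nn.Tanh(),
            nn.Linear(BOTTLENECK, HIDDEN_DIM),
        )

    def forward(self, x_embeds, y_embeds, y_ids):
        batch = x_embeds.size(0)
        prefix = self.mlp(self.prefix_embed).unsqueeze(0).expand(batch, -1, -1)
        # Teacher forcing: run the frozen LM over [prefix; x; y] embeddings.
        inputs = torch.cat([prefix, x_embeds, y_embeds], dim=1)
        logits = self.lm(inputs)              # assumed shape: (batch, seq, vocab)
        T = y_ids.size(1)
        # Autoregressive alignment: the logit at position t predicts the token at t+1,
        # so predictions for y come from the T positions just before each y token.
        pred = logits[:, -T - 1:-1, :]
        # Same cross-entropy objective as fine-tuning, -log p(y | x), but only the
        # prefix parameters receive gradients.
        return nn.functional.cross_entropy(
            pred.reshape(-1, pred.size(-1)), y_ids.reshape(-1)
        )
```

After training, only the resulting prefix activations need to be kept (the MLP exists to stabilize optimization), which is why what gets stored per task is a single small matrix, as noted above.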