
126 - Optimizing Continuous Prompts for Generation, with Lisa Li
NLP Highlights
How to Optimize Prefix Parameters for Training
The objective is the same as the fine-tuning objective: they both use the cross-entropy loss on p(y | x), except that now the set of trainable parameters is different. This difference leads to a large reduction in the parameters that we need to store, because we are freezing the language model parameters, so we don't need to store them anymore; we only need to store P_theta, which is a small matrix.

Cool, yeah, that makes sense. I think it's clear to me how exactly training works. One question I had relates to a detail that I saw in your paper, though: you say that directly optimizing the prefix parameters does not work, and you re-parameterize...
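To make the setup described here concrete, below is a minimal sketch of this training scheme, assuming a GPT-2 model from Hugging Face transformers: the language model is frozen and only a small prefix matrix is trained with the usual cross-entropy loss. For simplicity the sketch prepends trainable vectors at the embedding level, whereas the paper prepends activations at every layer; names like prefix_len are illustrative.

```python
import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Freeze all language model parameters: they are never updated or stored.
for p in model.parameters():
    p.requires_grad = False

prefix_len = 10
emb_dim = model.config.n_embd

# The only trainable (and stored) parameters: a small prefix matrix P_theta.
prefix = torch.nn.Parameter(torch.randn(prefix_len, emb_dim) * 0.02)
optimizer = torch.optim.Adam([prefix], lr=1e-4)

def training_step(input_ids, labels):
    # Prepend the prefix vectors to the token embeddings of (x, y).
    tok_emb = model.transformer.wte(input_ids)                     # (B, T, D)
    batch_prefix = prefix.unsqueeze(0).expand(input_ids.size(0), -1, -1)
    inputs_embeds = torch.cat([batch_prefix, tok_emb], dim=1)      # (B, P+T, D)

    # Mask out the prefix positions so the cross-entropy loss covers y only.
    ignore = torch.full((input_ids.size(0), prefix_len), -100, dtype=torch.long)
    full_labels = torch.cat([ignore, labels], dim=1)

    out = model(inputs_embeds=inputs_embeds, labels=full_labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()
```

After training, only the prefix tensor needs to be saved per task, which is the storage reduction mentioned above.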