
126 - Optimizing Continuous Prompts for Generation, with Lisa Li
NLP Highlights
The Differences Between Overparameterization and Adapter Tuning
Overparameterization gives you a lot more trainable parameters than simply doubling the prefix size. Here's the story: when we overparameterize with the MLP, the number of trainable parameters at training time is probably about equal to adapter tuning. However, doubling the prefix length does not lead to the same overparameterization effect as using the MLP.

Okay, cool, thanks, that's good to know. And I guess another point to add is that if you decide to directly optimize P_theta, without any overparameterization, it would still work; it just requires a very different set of hyperparameters.
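To make the comparison concrete, here is a minimal PyTorch sketch contrasting the two parameterizations discussed above: directly optimizing the prefix matrix P_theta versus overparameterizing it with an MLP over a smaller embedding. The class names, layer sizes, and bottleneck width are illustrative assumptions, not details from the episode or the paper.

```python
# Sketch only: names and sizes are illustrative assumptions.
import torch
import torch.nn as nn


class DirectPrefix(nn.Module):
    """Directly optimize P_theta of shape (prefix_len, n_layers * 2 * hidden)."""
    def __init__(self, prefix_len, n_layers, hidden):
        super().__init__()
        self.P = nn.Parameter(torch.randn(prefix_len, n_layers * 2 * hidden))

    def forward(self):
        return self.P  # used as prefix key/value activations for each layer


class MLPPrefix(nn.Module):
    """Overparameterized prefix: P_theta = MLP(P'_theta).

    The MLP adds many extra trainable parameters at training time (per the
    discussion above, roughly comparable to adapter tuning). After training,
    the MLP can be dropped and only the computed P_theta kept.
    """
    def __init__(self, prefix_len, n_layers, hidden, bottleneck=512):
        super().__init__()
        self.P_small = nn.Parameter(torch.randn(prefix_len, bottleneck))
        self.mlp = nn.Sequential(
            nn.Linear(bottleneck, bottleneck),
            nn.Tanh(),
            nn.Linear(bottleneck, n_layers * 2 * hidden),
        )

    def forward(self):
        return self.mlp(self.P_small)


# Trainable-parameter comparison: doubling prefix_len only doubles the rows of
# P_theta, which is a much smaller increase than what the MLP introduces.
direct = DirectPrefix(prefix_len=10, n_layers=12, hidden=1024)
reparam = MLPPrefix(prefix_len=10, n_layers=12, hidden=1024)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(direct), count(reparam))
```

Running the comparison at the bottom shows why doubling the prefix length is not the same kind of overparameterization: it only scales the rows of P_theta, whereas the MLP's weight matrices dominate the training-time parameter count.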