Kyunghyun Cho: Neural Machine Translation, Language, and Doing Good Science

The Gradient: Perspectives on AI

CHAPTER

Pre-Trained Models - What Do They Do?

There are two other recent papers of yours that I think it might be appropriate to pair together, because they both speak to ideas of fine-tuning models and also to intervening on performance and behavior. One is Mixout, this effective regularization for fine-tuning large-scale models. And the other was adaptive fusion, where you identify issues in existing approaches to transfer learning. Could you tell me a little bit about the suite of problems around fine-tuning and intervening on pre-trained models?

Right. What does pre-training actually do? If you think about stochastic gradient descent as our main optimizer, we always
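The episode doesn't get into the mechanics of Mixout here, but the idea behind the paper mentioned above can be sketched roughly as follows: during fine-tuning, each parameter is, with some probability, swapped back to its pretrained value, and the result is rescaled so that its expectation still equals the fine-tuned weights (analogous to inverted dropout). This is a minimal NumPy sketch, not the paper's implementation; the function name and arguments are illustrative.

```python
import numpy as np

def mixout(w, w_pre, p, rng):
    """Sketch of the Mixout idea: with probability p, swap each
    fine-tuned parameter back to its pretrained value, then rescale
    so the expected output equals the fine-tuned weights w."""
    if p == 0.0:
        return w.copy()
    mask = rng.random(w.shape) < p        # True -> use pretrained value
    mixed = np.where(mask, w_pre, w)
    # E[mixed] = p * w_pre + (1 - p) * w, so this rescaling
    # restores E[output] = w (inverted-dropout-style correction).
    return (mixed - p * w_pre) / (1.0 - p)
```

In practice this would be applied to a network's weight matrices at every training step, so the model is regularized toward the pretrained solution rather than toward zero as with weight decay.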
