Kyunghyun Cho: Neural Machine Translation, Language, and Doing Good Science

The Gradient: Perspectives on AI

Pre-Trained Models: What Do They Do?

There are two other recent papers of yours that I think it might be appropriate to pair together, because they both speak to fine-tuning models and intervening on their performance and behavior. One is Mixout, this effective regularization for fine-tuning large-scale models, and the other is adaptive fusion, where you identify issues in existing approaches to transfer learning. Could you tell me a little about the suite of problems around fine-tuning and intervening on pre-trained models?

Right. What does pre-training actually do? If you think about stochastic gradient descent as our main optimizer, we always
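As a rough aside on the Mixout technique mentioned above: the core idea is a dropout-like regularizer that, at each step, stochastically resets individual parameters to their pre-trained values instead of to zero, keeping the fine-tuned model close to the pre-trained one. A minimal numpy sketch of that idea (the function name and arguments here are illustrative, not from any particular library):

```python
import numpy as np

def mixout(w_current, w_pretrained, p, rng):
    """Sketch of the Mixout idea: with probability p, each coordinate
    is reset to its pre-trained value; the surviving coordinates are
    rescaled so the expected parameter equals w_current (analogous to
    inverted dropout's 1/(1-p) scaling, but around the pre-trained
    point rather than around zero)."""
    mask = rng.random(w_current.shape) < p          # True -> use pre-trained value
    mixed = np.where(mask, w_pretrained, w_current)  # per-coordinate mixture
    # Rescale the deviation from the pre-trained point so that
    # E[output] = w_current.
    return w_pretrained + (mixed - w_pretrained) / (1.0 - p)
```

In expectation the mixture is `p * w_pretrained + (1 - p) * w_current`, and the final rescaling recovers `w_current`, so the regularization acts only through the injected noise toward the pre-trained weights.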
