Sewon Min: The Science of Natural Language

The Gradient: Perspectives on AI

How to Train a Language Model Like This

In this work you also discussed two challenges to training a model like this. One is that full-corpus retrieval during training is going to be very expensive. The other is that learning to predict arbitrary-length phrases without a decoder is non-trivial. Perhaps first you could elaborate on those problems a little, and then tell me a bit about how you went about solving them?

Yeah, so for the first problem: training a retrieval model is in general harder than training a language model, because you need to keep the retrieval corpus updated, and it's hard to keep it updated. However, if we can construct a batch in a clever way then it
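The answer above is cut off, so what follows is only an illustrative sketch of one common way to avoid full-corpus retrieval during training, not a description of the speaker's method. It assumes that the candidates a query is scored against come only from the current batch, and that at least one of them is marked as a correct match. The function name `in_batch_contrastive_loss`, its parameters, and the toy vectors are all hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def in_batch_contrastive_loss(query, candidates, is_positive, temperature=1.0):
    """Contrastive loss for one query against in-batch candidates only.

    Assumption: `candidates` are vectors taken from the same training batch,
    standing in for the full corpus. `is_positive[i]` marks candidates that
    count as correct matches for the query.
    """
    if not any(is_positive):
        raise ValueError("need at least one positive candidate in the batch")
    scores = [dot(query, c) / temperature for c in candidates]
    top = max(scores)  # subtract the max score for numerical stability
    exp_scores = [math.exp(s - top) for s in scores]
    positive_mass = sum(e for e, p in zip(exp_scores, is_positive) if p)
    total_mass = sum(exp_scores)
    # Negative log of the probability mass assigned to the positives.
    return -math.log(positive_mass / total_mass)


# Toy batch: the first candidate points the same way as the query,
# the last one points the opposite way.
query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

loss_good = in_batch_contrastive_loss(query, candidates, [True, False, False])
loss_bad = in_batch_contrastive_loss(query, candidates, [False, False, True])
```

In this toy batch the loss is lower when the positive is the candidate most similar to the query, which is the behavior training would push toward. Because every candidate comes from the batch, nothing outside the batch needs to be re-encoded while the model's parameters change.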
