The Limits of NLP

Data Skeptic

Transfer Learning

We're training an 11 billion parameter model on about a trillion tokens of text. That takes a ton of computation. But once you've already done that, you sort of amortize the cost, and it makes it much cheaper for a practitioner to use the results. The expensive part is paid ahead of time when you do the pre-training.
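
To make the amortization point concrete, here is a minimal sketch (not from the episode) of the practitioner's side of transfer learning, using the Hugging Face transformers library. The "t5-small" checkpoint stands in for the 11 billion parameter model, which would not fit on most machines; the input/target pair is an illustrative placeholder.

```python
# Minimal sketch: the expensive pre-training has already been paid for,
# so the practitioner only downloads the checkpoint and fine-tunes.
# Assumes torch, transformers, and sentencepiece are installed.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Loading pre-trained weights: the cost of the trillion-token run
# is amortized across everyone who reuses this checkpoint.
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# One illustrative fine-tuning step on a single (input, target) pair.
inputs = tokenizer("translate English to German: Hello, world!",
                   return_tensors="pt")
labels = tokenizer("Hallo, Welt!", return_tensors="pt").input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss = model(input_ids=inputs.input_ids, labels=labels).loss
loss.backward()
optimizer.step()
print(f"fine-tuning loss: {loss.item():.3f}")
```

This cheap fine-tuning loop is the whole point: the practitioner's compute bill covers a few gradient steps, not the trillion-token pre-training run.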
