Transfer Learning
We're training an 11 billion parameter model on about a trillion tokens of text. That takes a ton of computation. But once you've already done that, you sort of amortize the cost, and it makes it much cheaper for a practitioner to use the results. The expensive part is kind of paid ahead of time when you do the pre-training.
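
To make the amortization point concrete, here is a minimal sketch of what the "cheap" practitioner side looks like: downloading pre-trained weights and running a short fine-tuning loop on a tiny downstream task. It assumes the Hugging Face transformers library and the public "t5-small" checkpoint, neither of which is named in the episode; the toy dataset is purely illustrative.

```python
# Illustrative sketch (not from the episode): reuse pre-trained weights and
# fine-tune cheaply. The expensive pre-training was already paid for upstream.
import torch
from torch.optim import AdamW
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Pre-trained weights are simply downloaded; no trillion-token training run needed.
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Tiny toy dataset standing in for a downstream task.
examples = [
    ("translate English to German: Hello, how are you?", "Hallo, wie geht es dir?"),
]

optimizer = AdamW(model.parameters(), lr=3e-4)
model.train()
for src, tgt in examples:
    inputs = tokenizer(src, return_tensors="pt")
    labels = tokenizer(tgt, return_tensors="pt").input_ids
    # The model returns a loss when labels are supplied.
    loss = model(input_ids=inputs.input_ids,
                 attention_mask=inputs.attention_mask,
                 labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The fine-tuning loop above runs in seconds on a laptop, which is the contrast the quote is drawing: the heavy cost lives entirely in the one-time pre-training step.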