The Gradient Podcast - Part 2

We're pretty optimistic that it has at least some bearing on large language models. Even if you do want to train these models, models at this scale can easily be trained on a single GPU and relatively small datasets. So for those of us without small supercomputers to run GPT-3, does that seem like it'll help? That was really interesting for me. We need these papers.

