
Evaluating models without test data (Practical AI #194)

Changelog Master Feed


Is WeightWatcher GPU-Optimized?

The current model runs a singular value decomposition on each layer. It could take anywhere from a couple of minutes to an hour. I mean, if you're trying to run it on GPT and you have a thousand layers, it's going to take some time. If you just have a few layers in your model and you're training a small model, it's very, very fast. You don't need a GPU; you can just run it. So that's sort of the takeaway.
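To make the per-layer cost concrete, here is a minimal sketch of running a singular value decomposition on each layer's weight matrix on the CPU. It is not the tool's own code: the three random matrices are a hypothetical stand-in for a small model, and `layer_singular_values` is an illustrative helper name, with plain NumPy doing the decomposition.

```python
import time

import numpy as np


def layer_singular_values(weights):
    """Return the singular values of each 2-D weight matrix, one array per layer."""
    # compute_uv=False skips the singular vectors, which are not needed here.
    return [np.linalg.svd(w, compute_uv=False) for w in weights]


# Hypothetical stand-in for a small model: three dense layers with random weights.
rng = np.random.default_rng(0)
layers = [
    rng.standard_normal((256, 128)),
    rng.standard_normal((128, 64)),
    rng.standard_normal((64, 10)),
]

start = time.perf_counter()
spectra = layer_singular_values(layers)
elapsed = time.perf_counter() - start

for i, s in enumerate(spectra):
    print(f"layer {i}: {len(s)} singular values, largest = {s[0]:.2f}")
print(f"total SVD time on CPU: {elapsed:.4f} s")
```

With only a few small layers the loop should finish almost immediately on a CPU. The work is one decomposition per layer, so the total time grows with both the number of layers and the size of each matrix, which is why a model with a thousand large layers takes much longer.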
