Evaluating models without test data (Practical AI #194)

Changelog Master Feed

CHAPTER

Is WeightWatcher GPU Optimized?

The current model runs a singular value decomposition on each layer. It could take anywhere from a couple of minutes to an hour. I mean, if you're trying to run it on GPT and you have a thousand layers, it's going to take some time. If you just have a few layers in your model and you're training a small model, it's very, very fast. You don't need the GPU; you just need to run it. So that's sort of the takeaway.
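To make the cost argument concrete, here is a minimal sketch of a per-layer singular value decomposition running on the CPU. It uses plain NumPy rather than the WeightWatcher API, and the `layers` list and the `layer_singular_values` helper are hypothetical stand-ins for a small model's weight matrices, not anything described in the episode.

```python
import time

import numpy as np


def layer_singular_values(weights):
    """Return the singular values of each 2-D weight matrix, in layer order."""
    # compute_uv=False skips the singular vectors, which is cheaper
    # when only the spectrum of each layer is needed.
    return [np.linalg.svd(w, compute_uv=False) for w in weights]


# Hypothetical small model: three random dense layers (an assumption for illustration).
rng = np.random.default_rng(0)
layers = [
    rng.standard_normal((256, 128)),
    rng.standard_normal((128, 64)),
    rng.standard_normal((64, 10)),
]

start = time.perf_counter()
spectra = layer_singular_values(layers)
elapsed = time.perf_counter() - start

for i, s in enumerate(spectra):
    print(f"layer {i}: {s.size} singular values, largest = {s[0]:.2f}")
print(f"total SVD time on CPU: {elapsed:.4f} s")
```

With only a few small layers, the loop should finish almost instantly on a CPU. The total time grows with both the number of layers and the size of each matrix, which is consistent with the point that a model with a thousand large layers takes much longer.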
