
03: The Next Generation of LLMs with Jonathan Frankle of MosaicML
Replit AI Podcast
The Future of Hyperparameter Search
The cheaper model training becomes, the more we will be able to afford testing on different checkpoints. I think mixture of experts in general has become really popular. We've trained a small model that works great, so for those who are on the call right now, it's there and ready to go. We're playing with optimizing architectures for H100s, and getting the details right is really hard. Every winning thing on Kaggle is an ensemble. Honestly, it's a really good approach when you're out of other ideas.
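As a rough illustration of the "ensemble" idea mentioned above (averaging predictions from several checkpoints of a model), here is a minimal sketch. The toy logistic "checkpoints" with random weights and the predict_proba helper are assumptions for the example only, not MosaicML's actual tooling or the speaker's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_proba(weights, X):
    """Sigmoid probabilities for one checkpoint's weight vector (toy stand-in)."""
    return 1.0 / (1.0 + np.exp(-X @ weights))

# Three "checkpoints" of the same model, e.g. saved at different training steps.
checkpoints = [rng.normal(size=4) for _ in range(3)]
X = rng.normal(size=(5, 4))  # a small batch of inputs

# The ensemble: average the per-checkpoint probabilities element-wise.
ensemble_probs = np.mean([predict_proba(w, X) for w in checkpoints], axis=0)
print(ensemble_probs)
```

The same averaging trick applies whether the members are checkpoints of one model or entirely different models; the cheaper training gets, the more such members one can afford to produce and combine.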