AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
How to Regularize a Loss Function in a Dep Network
There are many, many levers that folks are using during the training process. The bat sizes and learning rates and cyclical learning rates would be even more impactful than was originally believed. Once you get into one of butternekand onto a wide valley, it's very difficult to come back from that. And then critical lerning fior which is what we were talking about earlier, tells you that once you get into one or two white minima - people talk about white minima on so fort flat valleys.