The Story of Coherent Gradients

It's not clear that high gradient coherence really should lead to better generalization, either. I think the story of coherent gradients is very simple. It makes sense, but it's probably not the full story. But we don't know if that's harmful either. Could there be a way that that's beneficial? It's possible.
