
Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin - #734
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
00:00
Unveiling Anti-Grokking and Model Dynamics
This chapter explores the challenge of detecting distinct phases in deep neural network training, focusing on a newly identified phase called anti-grokking and its implications for overfitting. It argues for a reevaluation of traditional evaluation metrics, highlighting a correlation between hallucinations and model optimality and suggesting that phase-aware metrics may offer deeper insight into model performance.