Episode 32: Jamie Simon, UC Berkeley: On theoretical principles for how neural networks learn and generalize

Generally Intelligent

NOTE

Insight on Model Training and Generalization in Infinite-Width Neural Networks

Training a neural network on data with noisy labels makes the loss fall more slowly, and training take longer, than training on clean data. Infinite-width neural networks capture this phenomenon accurately: even in the infinite-width limit, convergence slows in the same way. Generalization in this limit is quantitatively determined by the alignment between the neural tangent kernel and the target function, so better alignment yields better generalization, and studying the kernel directly gives insight into a network's generalization behavior. This view explains, for instance, why convolutional networks outperform fully connected networks on image data: their kernels are better aligned with typical image targets.
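To make the alignment idea concrete, here is a minimal NumPy sketch (my illustration, not code from the episode). It uses an RBF kernel as a stand-in for a neural tangent kernel, measures alignment with one common definition, A(K, y) = yᵀKy / (‖K‖_F ‖y‖²), and then simulates gradient descent in the kernel's eigenbasis to show why a poorly aligned ("noisy") target trains more slowly.

```python
import numpy as np

def kernel_target_alignment(K: np.ndarray, y: np.ndarray) -> float:
    """A(K, y) = y^T K y / (||K||_F * ||y||^2).

    Near 1: the target lies along the kernel's top eigendirections
    (easy to learn). Near 0: the target is spread over low-eigenvalue
    directions (slow to learn, poor generalization).
    """
    return float(y @ K @ y) / (np.linalg.norm(K, "fro") * float(y @ y))

rng = np.random.default_rng(0)
n = 200

# Toy 1-D inputs and an RBF kernel standing in for an NTK (assumption:
# any positive-semidefinite kernel illustrates the same point).
x = np.sort(rng.uniform(-1.0, 1.0, n))
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1**2)
eigvals, eigvecs = np.linalg.eigh(K)  # eigenvalues in ascending order

# A target aligned with the kernel's top eigenvector vs. a random
# "noisy" target spread roughly evenly over all eigendirections.
y_aligned = eigvecs[:, -1]
y_noisy = rng.standard_normal(n)
y_noisy /= np.linalg.norm(y_noisy)

print("alignment, aligned target:", round(kernel_target_alignment(K, y_aligned), 3))
print("alignment, noisy target:  ", round(kernel_target_alignment(K, y_noisy), 3))

# Gradient descent on kernel regression: the residual along the i-th
# kernel eigendirection decays as (1 - lr * lambda_i)^t, so targets
# with weight on small eigenvalues (noise) converge slowly -- the
# "slower loss drop" described above.
lr = 0.9 / eigvals.max()
for name, y in [("aligned", y_aligned), ("noisy", y_noisy)]:
    c = eigvecs.T @ y  # target coefficients in the kernel eigenbasis
    for t in (10, 100, 1000):
        residual = np.linalg.norm((1.0 - lr * eigvals) ** t * c)
        print(f"{name:7s} t={t:4d} residual={residual:.4f}")
```

Running this, the aligned target's residual collapses within tens of steps while the noisy target's barely moves, and its alignment score is correspondingly near zero. In this spectral picture, a convolutional network's kernel being "better aligned" with image targets means image functions concentrate in its top eigendirections.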
