OLMo: Everything You Need to Train an Open Source LLM with Akshita Bhagia - #674

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

NOTE

Insights on Training Neural Networks

Training neural networks means navigating numerous confounding factors and variables, and the findings point to real behavioral differences between smaller and larger models: wait time affected loss-curve stability in larger models, and sharing weights between layers was effective in smaller models but not in larger ones. Non-parametric layer norms were also observed to yield better outcomes than parametric layer norms.
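A minimal PyTorch sketch of these two choices, assuming the weight sharing discussed is the common embedding/output-head tying (all names and sizes here are hypothetical, not OLMo's actual code):

```python
import torch
import torch.nn as nn


class TinyLM(nn.Module):
    """Toy decoder-style LM; illustrative only, not the OLMo implementation."""

    def __init__(self, vocab_size: int = 1000, d_model: int = 64, tie_weights: bool = True):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Non-parametric layer norm: elementwise_affine=False drops the
        # learnable gain and bias, leaving only the normalization itself.
        self.norm = nn.LayerNorm(d_model, elementwise_affine=False)
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)
        if tie_weights:
            # Weight tying: the output projection reuses the embedding matrix.
            # Per the episode, sharing helped smaller models but not larger ones.
            self.lm_head.weight = self.embed.weight

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.lm_head(self.norm(self.embed(tokens)))


model = TinyLM(tie_weights=True)                # small-model setting
logits = model(torch.randint(0, 1000, (2, 8)))  # (batch, seq) -> (batch, seq, vocab)
print(logits.shape)                             # torch.Size([2, 8, 1000])
```

Passing `elementwise_affine=False` is how `torch.nn.LayerNorm` becomes non-parametric; at larger scales one would set `tie_weights=False`, matching the observation that sharing stopped helping.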
