The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

OLMo: Everything You Need to Train an Open Source LLM with Akshita Bhagia - #674



Insights on Training Neural Networks

Training neural networks involves navigating numerous confounding factors and variables. The findings highlight behavioral differences between smaller and larger models: wait time affected loss-curve stability in larger models, and sharing weights between layers proved effective in smaller models but not in larger ones. It was also observed that non-parametric layer norms yielded better outcomes than parametric layer norms.
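To make the parametric vs. non-parametric distinction concrete, here is a minimal sketch in plain Python (an illustration only, not the OLMo implementation): a parametric layer norm learns a per-element gain and bias after normalizing, while a non-parametric layer norm normalizes and stops there.

```python
# Illustrative sketch of layer norm variants (hypothetical helper,
# not taken from the episode or the OLMo codebase).

def layer_norm(x, gain=None, bias=None, eps=1e-5):
    """Normalize a vector to zero mean and unit variance.

    With gain/bias supplied this is the parametric variant;
    with both left as None it is the non-parametric variant.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    normed = [(v - mean) / (var + eps) ** 0.5 for v in x]
    if gain is None and bias is None:
        return normed                               # non-parametric
    gain = gain if gain is not None else [1.0] * n  # learned scale
    bias = bias if bias is not None else [0.0] * n  # learned shift
    return [g * v + b for g, v, b in zip(gain, normed, bias)]

x = [1.0, 2.0, 3.0, 4.0]
print(layer_norm(x))                                 # non-parametric
print(layer_norm(x, gain=[2.0] * 4, bias=[0.5] * 4))  # parametric
```

The non-parametric variant drops the learned gain and bias entirely, removing a small number of parameters per layer; the episode's summary reports that this normalization-only form worked better in their runs.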
