Thomas Scialom, who led Llama 2 and now leads Llama 3, discusses pre-training with synthetic data, scaling laws, RLHF versus instruction tuning, and training on purely synthetic data. Llama 3 was trained on 15T tokens, with Llama 2 used as a classifier to shape the pre-training data mix. The conversation also covers the role of synthetic-data generation models and the challenges of optimizing models with human feedback in reinforcement learning.
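The idea of using an earlier model as a data classifier can be sketched as follows. This is a hypothetical illustration, not Meta's actual pipeline: the `score_quality` heuristic below is a toy stand-in for a real call to a Llama 2-style quality classifier, and the threshold value is an assumption.

```python
def score_quality(text: str) -> float:
    """Toy stand-in for an LLM quality classifier.

    In the approach described, a model like Llama 2 would score each
    candidate document; here we use a crude proxy (length and lexical
    diversity) so the sketch runs without any model dependency.
    Returns a score in [0, 1].
    """
    words = text.split()
    if not words:
        return 0.0
    diversity = len(set(words)) / len(words)     # repeated-token penalty
    length_bonus = min(len(words) / 50.0, 1.0)   # reward longer documents
    return 0.5 * diversity + 0.5 * length_bonus


def filter_corpus(docs: list[str], threshold: float = 0.5) -> list[str]:
    """Keep only documents whose classifier score clears the threshold,
    yielding a filtered mix for pre-training."""
    return [d for d in docs if score_quality(d) >= threshold]
```

In a real pipeline the classifier call would dominate cost, so scores are typically computed once, cached, and used to weight sampling rather than hard-filter.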

Podcast summary created with Snipd AI

Quick takeaways

  • The Llama 3 405B model surpasses GPT-4 on benchmarks after training on 15T tokens.
  • Synthetic data was crucial to Llama 3's post-training success.

Deep dives

AI Advancements in AGI Research and Development

With Llama 3, the team has made significant progress toward artificial general intelligence (AGI). As plans for Llama 4 get underway, the focus shifts to agent-based behaviors and evolving the models toward more advanced capabilities.
