Super Data Science: ML & AI Podcast with Jon Krohn

913: LLM Pre-Training and Post-Training 101, with Julien Launay

Aug 12, 2025
Julien Launay, Co-founder and CEO of AdaptiveML, shares insights on how his company simplifies reinforcement learning for data science teams, making AI more accessible to businesses. He traces his journey into tech, from Minecraft to building advanced AI tools. Key discussions include the role of reward functions in AI integration, the technical nuances of reinforcement learning algorithms, and the challenges of data quality. Julien also shares plans to democratize AI, fostering innovation across industries by making advanced models more widely available.
AI Snips
INSIGHT

Pre-Training Builds Broad Knowledge

  • Pre-training exposes models to vast web-scale text so they learn general patterns and knowledge.
  • Julien Launay notes this step trains models only to predict the next token, which leaves them unwieldy for direct interaction; see the sketch after this snip.
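
To make the pre-training objective concrete, here is a minimal next-token prediction sketch in PyTorch: every position in a sequence is trained, via cross-entropy, to predict the token that follows it. The tiny embedding-plus-linear model, the sizes, and names like next_token_loss are illustrative assumptions, not details from the episode.

```python
import torch
import torch.nn.functional as F

# Toy stand-in for an LLM: an embedding table plus a linear output head.
# All sizes and names here are illustrative, not from the episode.
vocab_size, d_model, seq_len = 100, 32, 8
embed = torch.nn.Embedding(vocab_size, d_model)
head = torch.nn.Linear(d_model, vocab_size)

def next_token_loss(tokens):
    """Cross-entropy between the prediction at position t and the token at t+1."""
    hidden = embed(tokens[:, :-1])     # encode every token except the last
    logits = head(hidden)              # unnormalized scores over the vocabulary
    targets = tokens[:, 1:]            # each position's label is its successor
    return F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))

batch = torch.randint(0, vocab_size, (4, seq_len))  # stand-in for web-scale text
loss = next_token_loss(batch)
loss.backward()  # gradients update the model, as in pre-training at real scale
```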
INSIGHT

Post-Training Sharpens Behavior

  • Post-training uses targeted data and feedback to make models interactive for tasks like chat.
  • Julien Launay highlights reinforcement learning as the key method for polishing outputs; a simplified update step is sketched below.
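
The episode names reinforcement learning as the key post-training method without detailing Adaptive ML's specific algorithm, so the following is only a generic REINFORCE-style sketch of a single policy-gradient update: the log-likelihood of a sampled response is scaled by a scalar reward, so positively rewarded responses become more likely. The toy model and all names are assumptions carried over from the sketch above.

```python
import torch
import torch.nn.functional as F

# Hypothetical reward-weighted fine-tuning step (REINFORCE-style), not
# Adaptive ML's actual algorithm. The reward could come from any feedback source.
vocab_size, d_model = 100, 32
embed = torch.nn.Embedding(vocab_size, d_model)
head = torch.nn.Linear(d_model, vocab_size)
opt = torch.optim.Adam(list(embed.parameters()) + list(head.parameters()), lr=1e-4)

def policy_gradient_step(response_tokens, reward):
    """One REINFORCE update: reward-weighted log-likelihood of the response."""
    logits = head(embed(response_tokens[:, :-1]))
    targets = response_tokens[:, 1:]
    # Negated cross-entropy is the mean next-token log-probability.
    log_prob = -F.cross_entropy(
        logits.reshape(-1, vocab_size), targets.reshape(-1), reduction="mean"
    )
    loss = -reward * log_prob  # positive reward reinforces these tokens
    opt.zero_grad()
    loss.backward()
    opt.step()

response = torch.randint(0, vocab_size, (1, 8))  # a sampled model response
policy_gradient_step(response, reward=1.0)       # feedback says: more like this
```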
INSIGHT

Three Reward Sources Scale RL

  • Use humans (RLHF), execution tests (RLEF), and AI reviewers (RLAIF) as complementary reward sources.
  • Julien Launay notes that AI feedback and verifiable tests scale far beyond manual labeling; a sketch combining the three sources follows this list.
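
To illustrate how the three reward sources could complement each other, here is a hypothetical combiner that averages whichever signals are available for a given sample. The RewardSources container, the equal weighting, and the toy test are assumptions for illustration, not a system described in the episode.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical sketch combining the three reward sources the episode names.
# Field names, weighting, and helper signatures are assumptions.

@dataclass
class RewardSources:
    human_label: Optional[float] = None                 # RLHF: preference score in [0, 1]
    unit_tests: Optional[Callable[[str], bool]] = None  # RLEF: verifiable execution check
    ai_judge: Optional[Callable[[str], float]] = None   # RLAIF: model-graded score

def combined_reward(response: str, sources: RewardSources) -> float:
    """Average whichever reward signals are available for this sample."""
    signals = []
    if sources.human_label is not None:
        signals.append(sources.human_label)
    if sources.unit_tests is not None:
        signals.append(1.0 if sources.unit_tests(response) else 0.0)
    if sources.ai_judge is not None:
        signals.append(sources.ai_judge(response))
    if not signals:
        raise ValueError("no reward source available for this sample")
    return sum(signals) / len(signals)

# Execution feedback needs no human in the loop per sample: here a toy "test"
# just checks that the response contains the expected code.
reward = combined_reward(
    "def add(a, b): return a + b",
    RewardSources(unit_tests=lambda r: "return a + b" in r),
)
print(reward)  # 1.0
```

Because execution tests and AI judges require no per-sample human effort, they are the sources that scale far beyond manual labeling, as the snip notes.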