ThursdAI - The top AI news from the past week

📆 Oct 9, 2025 — Dev Day’s Agent Era, Samsung’s 7M TRM Shock, Ling‑1T at 1T, Grok Video goes NSFW, and Serverless RL arrives

Oct 10, 2025
This week’s guest is Eric Provencher, founder of RepoPrompt and creator of RepoBench, who specializes in benchmarking AI-assisted coding. The episode covers the announcements from OpenAI’s Dev Day, including new coding agents and tooling. Eric explains how RepoBench evaluates coding performance through community-driven benchmarks. The discussion also covers Samsung’s 7M recursive model (TRM) and how Serverless RL could make reinforcement learning more accessible to developers.
INSIGHT

Platform Power Of ChatGPT Apps

  • Dev Day showed OpenAI turning ChatGPT into an app and agent platform with massive reach.
  • That platform effect amplifies distribution and positions OpenAI as a likely gatekeeper for many AI experiences.
ADVICE

Benchmark Code-Editing, Not Just Generation

  • Use RepoPrompt/RepoBench to evaluate models' real-world code editing ability, not just code generation.
  • Repeat each evaluation several times and report the median, since agent edits vary from run to run.
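
The advice about repeated runs can be sketched in a few lines of Python. This is a minimal illustration, not RepoBench's actual harness: `run_edit_benchmark` is a hypothetical stand-in that returns a simulated score, and the score range of 0 to 1 is an assumption. In practice it would be replaced by a call into whatever evaluation tool is in use.

```python
import random
import statistics


def run_edit_benchmark(seed: int) -> float:
    """Hypothetical stand-in for one benchmark run.

    Returns a simulated score in [0, 1]; a real harness would ask the
    model to perform code edits and grade the result.
    """
    rng = random.Random(seed)
    return min(1.0, max(0.0, rng.gauss(0.7, 0.1)))


def summarize_runs(scores: list[float]) -> dict[str, float]:
    """Summarize repeated runs: median for the headline, spread for context."""
    return {
        "median": statistics.median(scores),
        "min": min(scores),
        "max": max(scores),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
    }


# Five independent runs, each with its own seed (assumed run count).
scores = [run_edit_benchmark(seed) for seed in range(5)]
print(summarize_runs(scores))
```

The median is less sensitive than the mean to a single unusually good or bad run, while the min, max, and standard deviation show how much the results spread.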
INSIGHT

Serverless RL Lowers RL Barrier

  • Serverless RL (Weights & Biases + CoreWeave + OpenPipe) lowers infra barriers to practical reinforcement learning.
  • Managed RL can simplify reward/verification tooling and scale experiments with lower ops costs.