LessWrong (Curated & Popular)

“Trust me bro, just one more RL scale up, this one will be the real scale up with the good environments, the actually legit one, trust me bro” by ryan_greenblatt

Sep 4, 2025
The discussion dives into the challenges of scaling reinforcement learning (RL) due to low-quality environments. Arguments emerge about the potential benefits of better environments for enhancing AI capabilities. There is skepticism about whether recent advancements genuinely stem from environment improvements, with some suggesting AIs might soon create their own environments. The conversation also touches on the economics of developing RL environments, debating how budget and labor affect their quality and what algorithmic advances might follow.
INSIGHT

Progress Is A Sum Of Many Advances

  • Ryan Greenblatt argues recent progress already priced in improved RL environments and other advances.
  • Multiple seemingly huge advances combine into a smooth trend rather than a single break.
INSIGHT

Few True Trend Breakers Exist

  • Greenblatt sees only a few true trend-breaking breakthroughs in recent decades.
  • He lists deep learning at scale and generative pretraining as the main candidates.
  • By implication, a single RL scale-up is unlikely to join that short list.
ADVICE

Don't Overweight One RL Scale-Up

  • Don't assume RL scale-up alone will cause a massive above-trend jump in 2025.
  • Expect steady but not explosively super-exponential gains from RL and reasoning models.