Latent Space: The AI Engineer Podcast

[NeurIPS Best Paper] 1000 Layer Networks for Self-Supervised RL — Kevin Wang et al., Princeton

Jan 2, 2026
Kevin Wang, an undergraduate researcher at Princeton, and Ishaan Javali, his co-author, discuss their groundbreaking work on scaling reinforcement learning networks to 1,000 layers deep, a feat previously deemed impossible. They dive into the shift from traditional reward maximization to self-supervised learning methods, highlighting architectural breakthroughs like residual connections. The duo also explores efficiency trade-offs, data collection techniques using JAX, and the implications for robotics, positioning their approach as a radical shift in reinforcement learning objectives.

Undergrad Seminar Sparked The Project

  • Kevin started the project as an undergraduate in a Princeton IW seminar and led the collaboration.
  • The project grew as classmates like Ishaan and others joined and contributed.

Skepticism Turned Into A Calculated Bet

  • Benjamin was initially skeptical because past attempts at training very deep networks in RL had failed.
  • He still backed the experiment because infrastructure improvements had lowered the cost of trying.

Self-Supervision Reframes RL Scaling

  • Self-supervised RL replaces noisy value regression with representation learning via classification.
  • This objective scales like language/vision pretraining and enables much deeper networks.
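The objective described above can be sketched concretely: instead of regressing a noisy scalar value, the critic embeds state-action pairs and future goals, and is trained with a batch-wise classification (InfoNCE-style) loss where each pair's actually-reached goal is the positive and the rest of the batch serve as negatives. A minimal NumPy sketch under that assumption — function names and shapes are illustrative, not the authors' implementation:

```python
import numpy as np

def infonce_loss(sa_embed, goal_embed):
    """Contrastive classification loss (InfoNCE-style), illustrative only.

    Row i of sa_embed is the embedding of state-action pair i; row i of
    goal_embed embeds the goal actually reached from that pair. Each row's
    true goal is the positive; the other rows in the batch act as
    negatives, turning value learning into B-way classification.
    """
    logits = sa_embed @ goal_embed.T              # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))           # cross-entropy, diagonal labels

# Toy check: when every state-action embedding matches only its own goal,
# classification is easy and the loss is near zero; shuffling the goals
# breaks the pairing and the loss becomes large.
aligned = infonce_loss(5 * np.eye(4), 5 * np.eye(4))
shuffled = infonce_loss(5 * np.eye(4), 5 * np.roll(np.eye(4), 1, axis=0))
```

Because the loss is an ordinary classification objective rather than bootstrapped value regression, its gradients behave like those of language/vision pretraining, which is what lets depth scale.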