
Unsupervised Learning with Jacob Effron Ep 81: Ex-OpenAI Researcher On Why He Left, His Honest AGI Timeline, & The Limits of Scaling RL
Jan 29, 2026
Jerry Tworek, former VP of Research at OpenAI and architect of its reasoning models and Codex, shares why scaling hits limits and why continual learning matters for true AGI. He discusses the constraints of pre-training and RL, the economics pushing labs toward similar strategies, his reasons for leaving OpenAI, and a near-term robotics outlook with big societal stakes.
AI Snips
Scaling Delivers Predictable Gains
- Scaling pre-training and RL reliably improves models on what they are trained for.
- Generalization beyond training objectives remains the core unsolved question.
RL Needs Clear Signals
- Reinforcement learning succeeds where you can measure success reliably.
- Tasks with delayed, noisy, or weak feedback remain hard to train with RL.
Representation Learning Enables Generality
- Large-scale pre-training yields surprising, useful representations that generalize.
- A different model class could generalize better, but it's unclear what that looks like.