
Eye On A.I. #299 Jacob Buckman: Why the Future of AI Won't Be Built on Transformers
Nov 9, 2025
Jacob Buckman, an AI researcher and co-founder of Manifest AI, discusses rethinking AI memory and context. He explores the limitations of current transformers on long inputs and introduces the Power Retention model, which offers more efficient context management. Jacob explains how this architecture improves recall and reduces the need for frequent fine-tuning. He also touches on practical applications, such as retrieval agents and personal AIs that remember user context over time, and emphasizes the importance of open-sourcing the technology.
AI Snips
Transformers' Context Cost Is The Core Scaling Bottleneck
- Transformers pay a quadratic cost as context grows, which makes training on very long inputs intractable.
- Power Retention targets this input-size scaling directly, aiming to make large-context models tractable; a rough cost comparison is sketched below.
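To see why the quadratic term dominates, here is a back-of-envelope sketch with deliberately simplified FLOP formulas; it is an illustration of the scaling argument, not a measurement of Power Retention or any specific transformer.

```python
# Simplified FLOP counts: attention forms an L x L score matrix, while a
# retention-style model updates a fixed-size state once per token.
def attention_flops(seq_len, d_model):
    # Q @ K^T and scores @ V each cost roughly seq_len * seq_len * d_model multiply-adds.
    return 2 * seq_len * seq_len * d_model

def retention_flops(seq_len, d_key, d_value):
    # One fixed-cost state update (k v^T) and readout (q^T S) per token.
    return 2 * seq_len * d_key * d_value

for L in (2_000, 32_000, 512_000):
    print(f"L={L:>7}: attention ~{attention_flops(L, 128):.2e} FLOPs, "
          f"retention ~{retention_flops(L, 128, 128):.2e} FLOPs")
```

Attention's cost grows with the square of the context length, while the fixed-state recurrence grows only linearly, which is the gap Power Retention is aimed at.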
State-Space Models Offer A Dual Attention–Recurrence Form
- Mamba and other state-space models are retention models that can be expressed in either a recurrent form or an attention-like form, as the sketch below illustrates.
- Power Retention adds an adjustable state-size axis, so the state can scale independently of the parameter count.
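A minimal sketch of that duality, using plain softmax-free (linear) attention as a stand-in for a generic retention model; this is the standard equivalence between the masked pairwise-score form and the fixed-size-state recurrence, not Manifest AI's Power Retention kernel.

```python
import numpy as np

np.random.seed(0)
L, d = 6, 4                                  # tiny toy sizes
Q, K, V = (np.random.randn(L, d) for _ in range(3))

# "Attention" form: causal pairwise scores, O(L^2) in sequence length.
scores = Q @ K.T                             # (L, L)
mask = np.tril(np.ones((L, L)))              # causal mask
out_attention = (scores * mask) @ V          # (L, d)

# Recurrent form: carry a fixed-size state S of shape (d, d), O(L) in sequence length.
S = np.zeros((d, d))
out_recurrent = np.zeros((L, d))
for t in range(L):
    S += np.outer(K[t], V[t])                # fold token t into the state
    out_recurrent[t] = Q[t] @ S              # read out with the query

print(np.allclose(out_attention, out_recurrent))   # True: same computation, two forms
```

The two forms compute the same outputs; the recurrent one trades the growing score matrix for a state whose size is fixed by the architecture, which is exactly the quantity the next snip is about.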
State Size, Not Just Parameters, Drives Performance
- The small state sizes of modern retention models cap their performance by the time they reach compute-optimal scale.
- Enlarging the state size recovers that performance while staying efficient across a wide range of FLOP budgets; a toy illustration of growing the state without adding parameters follows below.
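A toy sketch of one way a state-size axis can work, assuming the expansion behaves like a degree-p feature map applied to keys and queries; the `power_features` helper is hypothetical and is not the released Power Retention code. The point it illustrates is that the recurrent state grows from d x d to d^p x d entries while the learned projections, and hence the parameter count, stay fixed.

```python
import numpy as np

def power_features(x, p):
    """Degree-p tensor-power feature map (illustrative assumption):
    phi(x) has size d**p and satisfies phi(q) . phi(k) == (q . k) ** p."""
    phi = x
    for _ in range(p - 1):
        phi = np.outer(phi, x).ravel()
    return phi

np.random.seed(0)
d, p = 8, 2
q, k, v = np.random.randn(d), np.random.randn(d), np.random.randn(d)

# State built from expanded keys: shape (d**p, d) instead of (d, d).
# No new learned parameters were introduced; only the state grew.
S = np.outer(power_features(k, p), v)
out = power_features(q, p) @ S

# Sanity check: the readout equals (q . k)^p * v, a sharper similarity kernel.
print(np.allclose(out, (q @ k) ** p * v))   # True
```

Raising p enlarges the state (and the recall capacity it can hold) without touching the model's weights, which is the sense in which state size becomes a scaling axis separate from parameter count.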
