

Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723
Mar 17, 2025
Jonas Geiping, a research group leader at the Ellis Institute and Max Planck Institute for Intelligent Systems, discusses innovative approaches to AI efficiency. He introduces a novel recurrent depth architecture that enables latent reasoning, allowing models to predict tokens with dynamic compute allocation based on difficulty. Geiping contrasts internal and verbalized reasoning in AI, explores challenges in scaling models, and highlights the architectural advantages that enhance performance in reasoning tasks. His insights pave the way for advancements in machine learning efficiency.
AI Snips
Maze-Solving Models
- Jonas Geiping's team trained recurrent models to solve mazes, scaling from 13x13 to 8000x8000 pixels.
- This demonstrated that models can learn algorithms and scale compute at test time.
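The maze result hinges on reusing the same weights for a variable number of iterations, so harder or larger inputs can simply be given more steps at test time. The sketch below illustrates that idea with a toy convolutional recurrent block; the layer sizes, maze encoding, and iteration counts are illustrative assumptions, not the exact model discussed in the episode.

```python
import torch
import torch.nn as nn

class RecurrentMazeSolver(nn.Module):
    """Toy sketch of test-time compute scaling on mazes.

    A single weight-shared block is applied repeatedly, so the number of
    iterations (and therefore the compute budget) is chosen at inference
    time rather than fixed by the architecture.
    """
    def __init__(self, channels: int = 32):
        super().__init__()
        self.encode = nn.Conv2d(3, channels, 3, padding=1)   # maze image -> features
        self.block = nn.Sequential(                           # weight-shared recurrent block
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.decode = nn.Conv2d(channels, 1, 3, padding=1)    # per-pixel "on the path" logit

    def forward(self, maze: torch.Tensor, iterations: int) -> torch.Tensor:
        h = self.encode(maze)
        for _ in range(iterations):       # more iterations = more test-time compute
            h = self.block(h)
        return self.decode(h)

model = RecurrentMazeSolver()
small = torch.randn(1, 3, 13, 13)      # small maze: a few iterations suffice
large = torch.randn(1, 3, 129, 129)    # larger maze: run the same weights for longer
_ = model(small, iterations=10)
_ = model(large, iterations=200)
```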
Recurrent Depth
- Recurrent depth decouples the amount of compute from the number of output tokens, unlike standard RNNs or Transformers, where the compute spent per token is fixed by the architecture.
- The model repeats a small set of layers, achieving arbitrary depth and compute before generating output.
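Applied to a language model, the same principle means looping a small core stack of layers in latent space before the output head produces a token. The following sketch shows the shape of that idea; the layer sizes, loop counts, and the lack of a prelude/coda split are simplifying assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

class RecurrentDepthLM(nn.Module):
    """Sketch of recurrent depth: a small core stack of layers is applied
    repeatedly in latent space before any token is emitted, so effective
    depth (and compute) per prediction is a runtime choice rather than a
    fixed property of the weights."""
    def __init__(self, vocab: int = 32000, d_model: int = 512, n_core_layers: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.core = nn.TransformerEncoder(layer, num_layers=n_core_layers)  # shared recurrent core
        self.head = nn.Linear(d_model, vocab)

    def forward(self, tokens: torch.Tensor, recurrences: int) -> torch.Tensor:
        h = self.embed(tokens)
        for _ in range(recurrences):      # unroll the same core: depth chosen at runtime
            h = self.core(h)
        return self.head(h)               # logits produced only after the latent loop

model = RecurrentDepthLM()
tokens = torch.randint(0, 32000, (1, 16))
easy = model(tokens, recurrences=2)       # spend little latent compute on an easy prediction
hard = model(tokens, recurrences=32)      # spend much more on a hard one, same parameters
```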
Scaling Challenges
- Scaling from a 100M-parameter prototype to training on 4,000 GPUs on Frontier was challenging.
- Adapting the code to AMD hardware, stabilizing Flash Attention, and writing a custom distributed data parallel (DDP) setup took considerable effort.
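For context on what a hand-rolled data-parallel step can look like, here is a minimal, generic sketch that averages gradients across ranks with torch.distributed. It is not the team's actual implementation, and it assumes the process group has already been initialized (e.g. via torch.distributed.init_process_group).

```python
import torch
import torch.distributed as dist

def allreduce_gradients(model: torch.nn.Module) -> None:
    """Average each parameter's gradient across all data-parallel ranks."""
    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)  # sum gradients over ranks
            param.grad.div_(world_size)                        # then average

# Typical use inside a training loop (assumes loss.backward() was just called):
#   allreduce_gradients(model)
#   optimizer.step()
#   optimizer.zero_grad()
```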