The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Scaling Up Test-Time Compute with Latent Reasoning with Jonas Geiping - #723

Mar 17, 2025
Jonas Geiping, a research group leader at the Ellis Institute and Max Planck Institute for Intelligent Systems, discusses innovative approaches to AI efficiency. He introduces a novel recurrent depth architecture that enables latent reasoning, allowing models to predict tokens with dynamic compute allocation based on difficulty. Geiping contrasts internal and verbalized reasoning in AI, explores challenges in scaling models, and highlights the architectural advantages that enhance performance in reasoning tasks. His insights pave the way for advancements in machine learning efficiency.
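
The description's central claim is dynamic compute allocation: spend more iterations of latent reasoning on hard tokens, fewer on easy ones. Below is a minimal sketch of one way that can work, assuming a simple convergence test on the latent state; the stopping rule and the names here are illustrative assumptions, not the episode's exact method.

```python
import torch

def adaptive_steps(core, state, x, max_steps=64, tol=1e-3):
    """Iterate `core` on the latent state until it stops changing,
    so easy inputs exit early and hard inputs use more compute."""
    for step in range(1, max_steps + 1):
        new_state = core(state, x)
        if (new_state - state).norm() < tol:  # latent state has converged
            return new_state, step
        state = new_state
    return state, max_steps

# Toy stand-in for the core: a contractive update converges in about a
# dozen steps (an "easy" input); a slowly mixing one would use the full budget.
core = lambda s, x: 0.5 * s + x
final, used = adaptive_steps(core, torch.ones(8), torch.zeros(8))
print(used)
```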
AI Snips
ANECDOTE

Maze-Solving Models

  • Jonas Geiping's team trained recurrent models to solve mazes, scaling from 13x13 to 8000x8000 pixels.
  • This demonstrated that models can learn algorithms and scale compute at test time.
INSIGHT

Recurrent Depth

  • Recurrent depth decouples compute from output: unlike a standard RNN (which recurs once per token) or a fixed-depth Transformer, the amount of computation per token is not fixed by the architecture.
  • The model repeats a small set of layers, reaching arbitrary effective depth and compute before generating output; see the sketch below.
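
A minimal sketch of the repeated-layers idea in PyTorch-style Python. The embed/core/unembed split follows the episode's description of reusing a small set of layers; the specific modules, the input re-injection, and the random initial state are assumptions for illustration, not the exact architecture.

```python
import torch
import torch.nn as nn

class RecurrentDepthLM(nn.Module):
    def __init__(self, vocab_size=256, d_model=64, nhead=4):
        super().__init__()
        # Embed tokens into latent space once, up front.
        self.embed = nn.Embedding(vocab_size, d_model)
        # Small core block whose weights are reused at every iteration.
        self.core = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        # Re-inject the input embedding into the latent state each step (an assumption).
        self.inject = nn.Linear(2 * d_model, d_model)
        # Decode the final latent state into token logits once, at the end.
        self.unembed = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, num_steps=8):
        e = self.embed(tokens)
        s = torch.randn_like(e)  # random initial latent state (an assumption)
        for _ in range(num_steps):  # depth (= compute) is chosen at run time
            s = self.core(self.inject(torch.cat([s, e], dim=-1)))
        return self.unembed(s)

model = RecurrentDepthLM()
x = torch.randint(0, 256, (1, 16))
logits_shallow = model(x, num_steps=4)   # cheap pass for easy inputs
logits_deep = model(x, num_steps=64)     # same weights, far more compute
```

Because the same weights are reused, the number of iterations is a free knob at inference, which is what lets models of this kind apply more compute to harder inputs than they were trained with.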
ANECDOTE

Scaling Challenges

  • Scaling from a 100M parameter prototype to 4,000 GPUs on Frontier was challenging.
  • Adapting to AMD GPUs, stabilizing Flash Attention, and writing a custom DDP implementation (the basic pattern is sketched below) took considerable effort.
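
For context on that last point, here is a generic sketch of the gradient averaging that data-parallel training performs and that a hand-rolled DDP has to implement; it uses standard torch.distributed calls and is not the team's actual code.

```python
import torch
import torch.distributed as dist

def sync_gradients(model: torch.nn.Module):
    """Average gradients across ranks: call after backward(), before step().
    Assumes the process group has already been initialized."""
    world_size = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            # Sum this parameter's gradient over all ranks, then average.
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
```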