Catalyst with Shayle Kann

Will inference move to the edge?

Dec 18, 2025
Shayle is joined by Ben Lee, a professor at the University of Pennsylvania and a visiting researcher at Google who focuses on AI systems. They discuss the shift from centralized AI compute to edge inference, which matters for latency-sensitive applications like autonomous vehicles. Ben explains the differences between hyperscale, edge, and on-device computing, and why training will stay centralized. He also weighs the challenges and potential of local data centers, including the implications for energy consumption and the future landscape of AI applications.
INSIGHT

Edge Is The Middle Ground

  • Edge computing places capable machines closer to users, cutting network latency and improving responsiveness.
  • It sits between hyperscale data centers and personal devices, delivering lower-latency compute than a distant hyperscale facility (see the latency sketch below).
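To make the latency claim concrete, here is a minimal back-of-envelope sketch in Python. The distances (a hyperscale region roughly 2,000 km away versus a metro edge site roughly 50 km away) and the fiber propagation factor are illustrative assumptions, not figures from the episode; it counts only speed-of-light propagation and ignores routing, queuing, and processing delays.

```python
# Propagation-only round-trip latency vs. distance to the compute site.
# Distances and the fiber factor are illustrative assumptions.

SPEED_OF_LIGHT_KM_S = 300_000   # approximate, in vacuum
FIBER_FACTOR = 0.67             # light in fiber travels at roughly 2/3 c

def round_trip_ms(distance_km: float) -> float:
    """Round-trip propagation time in milliseconds (ignores routing,
    queuing, and server processing, which add more in practice)."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000

for label, km in [("hyperscale region, ~2,000 km", 2000),
                  ("metro edge site, ~50 km", 50)]:
    print(f"{label}: ~{round_trip_ms(km):.1f} ms round trip")
```

Even before queuing and processing, the edge site shaves roughly 20 ms off every round trip in this scenario, which is exactly the margin latency-sensitive applications care about.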
INSIGHT

Training Stays Centralized

  • Training needs massive, tightly interconnected GPU clusters because models and datasets are enormous.
  • Distributed training requires frequent, high-volume communication between GPUs, which favors hyperscale data centers (see the sketch below).
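Why that communication is so expensive: in data-parallel training, every step ends with an all-reduce of the full gradient across workers. Here is a minimal sketch assuming a 70B-parameter model, bf16 gradients, 1,024 GPUs, and a 50 GB/s link (all illustrative values, not figures from the episode), using the standard ring all-reduce cost of about 2*(n-1)/n times the buffer size per GPU.

```python
# Per-step gradient traffic for data-parallel training with a ring all-reduce.
# Model size, precision, GPU count, and link speed are illustrative assumptions.

params = 70e9           # assumed 70B-parameter model
bytes_per_grad = 2      # bf16 gradients
num_gpus = 1024         # assumed data-parallel width
link_gb_per_s = 50      # assumed 400 Gb/s interconnect = 50 GB/s

grad_bytes = params * bytes_per_grad
# Ring all-reduce: each GPU sends/receives about 2*(n-1)/n times the buffer.
per_gpu_traffic = 2 * (num_gpus - 1) / num_gpus * grad_bytes

print(f"gradient buffer:        {grad_bytes / 1e9:.0f} GB")
print(f"per-GPU traffic / step: {per_gpu_traffic / 1e9:.0f} GB")
print(f"sync time at {link_gb_per_s} GB/s: {per_gpu_traffic / 1e9 / link_gb_per_s:.1f} s")
```

Repeating a hundreds-of-gigabytes exchange every optimizer step is only viable on the very high-bandwidth, tightly coupled interconnects that hyperscale sites provide; real systems also overlap this communication with compute, but the bandwidth requirement remains.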
INSIGHT

Inference Could Drive Future Growth

  • Historically, training dominated AI energy and compute, but inference is poised to grow rapidly as usage expands.
  • If adoption surges, rising inference demand will drive most future growth in compute cost (illustrated in the sketch below).
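One way to see the crossover: training compute is a one-time cost, while inference compute accrues per token served. The sketch below uses the common approximations of about 6*N*D FLOPs to train an N-parameter model on D tokens and about 2*N FLOPs per generated token; N, D, and the serving rate are illustrative assumptions, not figures from the episode.

```python
# When does cumulative inference compute overtake the one-time training cost?
# Uses the common estimates: train ~ 6*N*D FLOPs, inference ~ 2*N FLOPs/token.
# N, D, and tokens_per_day are illustrative assumptions.

N = 70e9               # assumed model parameters
D = 2e12               # assumed training tokens
tokens_per_day = 1e10  # assumed serving volume

train_flops = 6 * N * D
flops_per_token = 2 * N
breakeven_tokens = train_flops / flops_per_token   # simplifies to 3 * D

print(f"training compute: {train_flops:.1e} FLOPs")
print(f"break-even after: {breakeven_tokens:.1e} served tokens")
print(f"that is ~{breakeven_tokens / tokens_per_day:.0f} days at {tokens_per_day:.0e} tokens/day")
```

Under these assumptions, cumulative inference passes training after about 3*D served tokens, so once a model is popular enough, serving it rather than training it dominates lifetime compute.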