The New Stack Podcast

Confronting AI’s Next Big Challenge: Inference Compute

Aug 6, 2025
In a wide-ranging conversation, Sid Sheth, Founder and CEO of d-Matrix, dives into the complexities of AI inference. He emphasizes that inference isn’t a one-size-fits-all challenge: different workloads demand specialized hardware. Sid introduces Corsair, d-Matrix’s modular platform designed to minimize the distance between memory and compute for faster performance. He also explores the parallels between human learning and AI deployment, and stresses that enterprises need tailored infrastructure to integrate AI effectively.
INSIGHT

Inference as Knowledge Application

  • AI inference is when a trained model applies its learned knowledge to solve new tasks, much as humans apply skills after formal education ends (see the sketch below).
  • Inference usually involves little or no retraining; the focus is on applying existing knowledge efficiently and monetizing it.
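
To make the distinction concrete, here is a minimal sketch in PyTorch (the tiny model and random data are hypothetical placeholders, not anything d-Matrix ships): training updates the model’s weights via gradients, while inference is a frozen forward pass that simply applies what was already learned.

```python
import torch
import torch.nn as nn

# A tiny stand-in model; any trained network would do.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

# --- Training: weights are updated from gradients ---
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))  # placeholder data
loss = loss_fn(model(x), y)
loss.backward()      # compute gradients
optimizer.step()     # adjust the learned knowledge

# --- Inference: apply learned knowledge, no weight updates ---
model.eval()                      # disable training-only behavior
with torch.no_grad():             # skip gradient bookkeeping entirely
    prediction = model(x).argmax(dim=1)
```

Because the inference path skips gradient computation and weight updates, it is a much lighter workload than training, which is why it can be served on hardware tuned purely for fast forward passes.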
INSIGHT

Inference Challenges and Agentic AI

  • The biggest inference challenges involve integrating AI models into existing workflows and driving user adoption through better interfaces.
  • Agentic AI will increase compute demands and put a premium on low latency, because machines communicating with other machines operate far faster than humans do.
INSIGHT

Heterogeneous Inference Hardware Needed

  • Inference workloads vary widely; unlike monolithic training setups, a one-size-fits-all hardware approach no longer works.
  • The inference hardware landscape will become heterogeneous, with specialized devices serving different performance profiles and needs.