Inference is now the biggest challenge in enterprise AI. In this episode of Eye on AI, Craig Smith speaks with Nick Pandher, VP of Product at Cirrascale, about why AI workloads are shifting from model training to inference at scale. As AI moves into production, enterprises are prioritizing performance, latency, reliability, and cost efficiency over raw compute. The conversation covers the rise of inference-first infrastructure, the limits of hyperscalers, the emergence of neoclouds, and how agentic AI is driving always-on inference workloads. Nick also explains how inference-optimized hardware and serverless AI platforms are shaping the future of enterprise AI deployment.
If you're deploying AI in production, this episode explains why inference is the real frontier.
Stay Updated:
Craig Smith on X: https://x.com/craigss
Eye on A.I. on X: https://x.com/EyeOn_AI
(00:00) Preview
(00:50) Introduction to Cirrascale and AI inference
(03:04) What makes Cirrascale a neocloud
(04:42) Why AI shifted from training to inference
(06:58) Private inference and enterprise security needs
(08:13) Hyperscalers vs neoclouds for AI workloads
(10:22) Performance metrics that matter in inference
(13:29) Hardware choices and inference accelerators
(20:04) Real enterprise AI use cases and automation
(23:59) Hybrid AI, regulated industries, and compliance
(26:43) Proof of value before AI pilots
(31:18) White-glove AI infrastructure vs self-serve cloud
(33:32) Qualcomm partnership and inference-first AI
(41:52) Edge-to-cloud inference and agentic workflows
(49:20) Why AI pilots fail and how enterprises succeed