MLOps.community

How Universal Resource Management Transforms AI Infrastructure Economics

Jan 20, 2026
Wilder Lopes, a second-time founder and CEO of Ogre.run, dives into the pressing challenges of AI infrastructure. He discusses how many workloads can effectively run on idle CPUs instead of GPUs, addressing the often-overlooked memory bottleneck. Wilder highlights innovations like CXL memory expansion and the need for better developer tooling for non-GPU hardware. He envisions a future with a diverse 'NeoCloud', emphasizing the importance of equipping developers with hardware knowledge and of leveraging second-hand data centers to advance AI globally.
INSIGHT

Leverage Existing Compute For Inference

  • Many inference workloads can run on existing CPUs and unused accelerators instead of waiting months for new GPUs.
  • For inference, prioritize memory capacity over peak compute so you can serve more workloads now; the sizing sketch after this list shows why memory is the gate.
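
A back-of-the-envelope sizing sketch in Python (parameter counts and precisions are illustrative assumptions, not figures from the episode) showing why memory, not compute, gates serving large models:

    # Memory needed just to hold model weights, ignoring KV cache and activations.
    def weights_gib(params_billions: float, bytes_per_param: float) -> float:
        return params_billions * 1e9 * bytes_per_param / 2**30

    for name, params in [("7B", 7), ("70B", 70)]:
        print(f"{name}: fp16 ~{weights_gib(params, 2):.0f} GiB, "
              f"int8 ~{weights_gib(params, 1):.0f} GiB")
    # 7B: fp16 ~13 GiB, int8 ~7 GiB
    # 70B: fp16 ~130 GiB, int8 ~65 GiB

A 70B model in fp16 overflows a single GPU's VRAM but fits comfortably in expandable system memory, which is the argument for counting capacity first.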
ADVICE

Expand Memory Before Buying GPUs

  • Expand server memory (e.g., via CXL) to enable CPU-only inference on large models and datasets, as in the sketch below.
  • This avoids buying expensive GPU systems while still providing enough memory to load models and data.
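
A minimal sketch of CPU-only inference, assuming the Hugging Face transformers library; the model ID is a hypothetical placeholder for any large causal LM. With no GPU involved, the binding constraint is system RAM, which is exactly what CXL expansion grows:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-2-13b-hf"  # assumption: swap in your model

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # halves weight memory vs. float32
        low_cpu_mem_usage=True,      # load weights without double-buffering
    )  # no .to("cuda"): weights stay in system memory
    model.eval()

    inputs = tokenizer("The memory wall means", return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(out[0], skip_special_tokens=True))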
INSIGHT

Tooling Trumps Hardware Performance

  • Tooling and compatibility, not raw hardware, are the main barriers to using non-NVIDIA chips.
  • Developers stick with CUDA because its ecosystem and abstractions make it faster to ship; the device-selection sketch below shows the portable style that weakens that lock-in.
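
The kind of abstraction that loosens the lock-in, as a minimal sketch using PyTorch's backend checks (which backends are actually present depends on how PyTorch was built, not on this code):

    import torch

    def pick_device() -> torch.device:
        if torch.cuda.is_available():          # NVIDIA (or ROCm builds)
            return torch.device("cuda")
        if torch.backends.mps.is_available():  # Apple silicon
            return torch.device("mps")
        return torch.device("cpu")             # always-available fallback

    device = pick_device()
    x = torch.randn(1024, 1024, device=device)
    print(device, (x @ x).sum().item())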