Wilder Lopes, a second-time founder and CEO of Ogre.run, dives into the pressing challenges of AI infrastructure. He discusses how many workloads can effectively run on idle CPUs instead of GPUs, addressing the often overlooked memory bottleneck. Wilder highlights innovations like CXL memory expansion and the need for better developer tooling for non-GPU hardware. He envisions a future with a diverse 'NeoCloud', emphasizing the importance of equipping developers with hardware knowledge and leveraging second-hand data centers for global AI advancements.
48:21
INSIGHT
Leverage Existing Compute For Inference
Many inference workloads can run on existing CPUs and unused accelerators instead of waiting months for new GPUs.
Prioritize memory capacity over peak compute for inference to serve more workloads now.
ADVICE
Expand Memory Before Buying GPUs
Expand server memory (e.g., via CXL) to enable CPU-only inference for large models and datasets.
This avoids buying expensive GPU systems while providing enough memory to load models and data.
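The advice above comes down to simple arithmetic: weights times bytes-per-parameter versus available RAM. Below is a minimal back-of-the-envelope sketch of that check; the parameter count, dtype width, and 30% headroom factor are illustrative assumptions, not figures from the episode.

```python
# Rough feasibility check: can a model be served from system RAM alone?
# All numbers here are illustrative assumptions for the sketch.

def model_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate resident size of model weights in GiB (fp16 by default)."""
    return n_params * bytes_per_param / 2**30

def fits_in_ram(n_params: float, ram_gb: float, headroom: float = 1.3) -> bool:
    """True if weights plus ~30% headroom (KV cache, activations) fit in RAM."""
    return model_memory_gb(n_params) * headroom <= ram_gb

# A 70B-parameter model in fp16 needs roughly 130 GiB for weights alone:
# far beyond a 24 GiB GPU, but comfortable on a 256 GiB server whose
# memory was expanded (e.g. via CXL) for CPU-only inference.
print(fits_in_ram(70e9, ram_gb=256))  # large-memory CPU server
print(fits_in_ram(70e9, ram_gb=24))   # single consumer GPU
```

The point of the sketch is that for memory-bound inference, capacity is the binding constraint before peak FLOPS, which is why adding RAM can unlock workloads that no affordable single GPU can hold.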
INSIGHT
Tooling Trumps Hardware Performance
Tooling and compatibility, not raw hardware, are the main barriers to using non-NVIDIA chips.
Developers stick with CUDA because ecosystem and abstractions make it faster to ship.
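One common way teams paper over this portability gap is a backend-preference fallback: try the vendor-optimized path first, then degrade to whatever hardware is present. A minimal sketch of that pattern follows; the backend names and the `pick_backend` helper are illustrative, not an API from the episode or any specific framework.

```python
def pick_backend(available: set[str],
                 preference: tuple[str, ...] = ("cuda", "rocm", "cpu")) -> str:
    """Return the first backend in preference order that is actually available.

    `available` would normally be probed at runtime (driver checks,
    library imports); here it is passed in explicitly to keep the
    sketch self-contained.
    """
    for name in preference:
        if name in available:
            return name
    raise RuntimeError("no supported backend found")

# On a CUDA box this picks "cuda"; on a CPU-only server it falls back to "cpu".
print(pick_backend({"cuda", "cpu"}))
print(pick_backend({"cpu"}))
```

The fallback chain is trivial to write; the hard part the insight points at is that the non-CUDA branches rarely get the same kernel coverage and tooling, which is why developers default to the first entry in the list.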
Wilder Lopes is the CEO and Founder of Ogre.run, working on AI-driven dependency resolution and reproducible code execution across environments.

How Universal Resource Management Transforms AI Infrastructure Economics // MLOps Podcast #357 with Wilder Lopes, CEO / Founder of Ogre.run

Join the Community:
// Abstract
Enterprise organizations face a critical paradox in AI deployment: while 52% struggle to access needed GPU resources, facing 6-12 month waitlists, 83% of existing CPU capacity sits idle. This talk introduces an approach to AI infrastructure optimization through universal resource management that reshapes applications to run efficiently on any available hardware: CPUs, GPUs, or accelerators.

We explore how code reshaping technology can unlock the untapped potential of enterprise computing infrastructure, enabling organizations to serve 2-3x more workloads while dramatically reducing dependency on scarce GPU resources. The presentation demonstrates why CPUs often outperform GPUs for memory-intensive AI workloads, offering superior cost-effectiveness and immediate availability without architectural complexity.

// Bio
Wilder Lopes is a second-time founder, developer, and research engineer focused on building practical infrastructure for developers. He is currently building Ogre.run, an AI agent designed to solve code reproducibility.

Ogre enables developers to package source code into fully reproducible environments in seconds. Unlike traditional tools that require extensive manual setup, Ogre uses AI to analyze codebases and automatically generate the artifacts needed to make code run reliably on any machine. The result is faster development workflows and applications that work out of the box, anywhere.

// Related Links
Website: https://ogre.run
https://lopes.ai
https://substack.com/@wilderlopes
https://youtu.be/YCWkUub5x8c?si=7RPKqRhu0Uf9LTql
~~~~~~~~ ✌️ Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community: https://go.mlops.community/slack
Follow us on X/Twitter: @mlopscommunity (https://x.com/mlopscommunity) or LinkedIn: https://go.mlops.community/linkedin
Sign up for the next meetup: https://go.mlops.community/register
MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Wilder on LinkedIn: /wilderlopes

Timestamps:
[00:00] Secondhand Data Centers Challenges
[00:27] AI Hardware Optimization Debate
[03:40] LLMs on Older Hardware
[07:15] CXL Tradeoffs
[12:04] LLM on CPU Constraints
[17:07] Leveraging Existing Hardware
[22:31] Inference Chips Overview
[27:57] Fundamental Innovation in AI
[30:22] GPU CPU Combinations
[40:19] AI Hardware Challenges
[43:21] AI Perception Divide
[47:25] Wrap up