

Simon Mo
Co-creator and core maintainer of the open-source inference engine vLLM and co-founder of Inferact, with a background in ML systems research focused on efficient LLM serving and runtime engineering.
Best podcasts with Simon Mo
Ranked by the Snipd community

86 snips
Jan 22, 2026 • 44min
Inferact: Building the Infrastructure That Runs Modern AI
Simon Mo and Woosuk Kwon, co-founders of Inferact, discuss their work on building a universal open-source inference layer for AI. They trace the evolution of vLLM from a UC Berkeley prototype to a key piece of AI infrastructure. The two cover the complexities of running large AI models, including challenges like scheduling and memory management. They emphasize the importance of open source in driving diversity and interoperability, and envision a future where efficient inference is as foundational as operating systems.

23 snips
Jan 22, 2026 • 44min
Inferact: Building the Infrastructure That Runs Modern AI
Simon Mo and Woosuk Kwon, co-founders of Inferact and core maintainers of the vLLM inference engine, dive into the complexities of modern AI infrastructure. They discuss how vLLM grew out of Berkeley research aimed at improving large language model deployment. The two highlight the challenges of scheduling and of managing diverse model architectures for efficient inference. They also share their vision for a universal inference layer that supports any hardware or model, emphasizing the importance of open-source collaboration for innovation.


