
LLMs on CPUs, Period
The Data Exchange with Ben Lorica
00:00
Introduction
The host welcomes a guest from Neural Magic to discuss running LLMs on CPUs and how model sparsity can accelerate open-source LLMs, with a focus on deployment and inference.