
LLMs on CPUs, Period
The Data Exchange with Ben Lorica
Fine-tuning Models: Quantization and Sparsification
A discussion of fine-tuning models, focusing on tools for quantization and sparsification that make models smaller and more efficient to run on CPUs. The chapter covers hardware requirements, size-reduction techniques, and the empirical, trial-and-error nature of reaching high levels of sparsity while preserving model accuracy.
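To make the two techniques concrete, here is a minimal NumPy sketch (not code from the episode) of the two ideas discussed: magnitude pruning, which zeroes the smallest weights to reach a target sparsity, and symmetric per-tensor int8 quantization, which stores weights in 8 bits instead of 32. Function names and the 90% sparsity target are illustrative assumptions, not anything specified by the guest.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude weights to reach the target sparsity."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    # Threshold at the k-th smallest absolute value.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map floats into [-127, 127]."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)

pruned = magnitude_prune(w, sparsity=0.9)  # ~90% of weights zeroed
q, scale = quantize_int8(pruned)           # int8 storage: 4x smaller than float32
dequant = q.astype(np.float32) * scale     # approximate reconstruction at inference
```

In practice these are combined: a sparse int8 model skips zeroed weights entirely and fits the remainder in CPU caches, which is where the inference speedups come from.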