LLMs on CPUs, Period

The Data Exchange with Ben Lorica

Fine-Tuning and Sparsification of Llama 7B

This chapter explores shrinking the pre-trained Llama 7B model through sparsification and quantization, discussing the trade-offs between accuracy and speed. It highlights how combining sparsity with quantization yields both a smaller model and faster inference.

