692: Lossless LLM Weight Compression: Run Huge Models on a Single GPU
Jun 30, 2023
07:39
Join Jon as he walks listeners through SpQR, a near-lossless LLM weight compression technique built on quantization. In this week's episode, he breaks down the four steps behind the method.