Papers Read on AI

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Aug 9, 2024