Eye On A.I. cover image

#281 Leon Song: The Research Driving Next-Gen Open-Source Models (Together AI)

Eye On A.I.

00:00

How Speculative Decoding Boosts Inference

  • Speculative decoding uses a small speculator model and a larger verifier to speed inference without losing accuracy.
  • High acceptance rates for the speculator are critical to realize meaningful latency gains.
Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app