ColBERT + ColBERTv2: late interaction at a reasonable inference cost

Neural Search Talks — Zeta Alpha

ColBERTv2's Residual Compression Technique

ColBERTv2 has improved training techniques, which we'll take as given here, but they also do some things to speed things up and, essentially, to lower storage costs. They now have embeddings for every term of every document in the collection, and on top of that a bunch of centroids. So they store each term embedding as basically the ID of its nearest centroid plus a heavily quantized delta vector. This gives clear storage savings, because you only store the centroids once, and the delta vectors take just one or two bits per dimension. Yeah. Okay.
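To make the idea concrete, here is a minimal sketch of that centroid-plus-quantized-residual scheme. It is not ColBERTv2's actual implementation: the k-means clustering, the uniform quantizer, and the bit width are illustrative assumptions, and packing the codes into true 1-2 bit storage is omitted for clarity.

```python
# Sketch of storing term embeddings as (nearest-centroid id, quantized residual).
import numpy as np
from sklearn.cluster import KMeans

BITS = 2                  # bits per residual dimension (ColBERTv2 uses 1 or 2)
N_LEVELS = 2 ** BITS      # number of quantization buckets per dimension

def build_index(term_embeddings: np.ndarray, n_centroids: int):
    """Cluster all term embeddings, then compress each one as the ID of its
    nearest centroid plus a quantized delta (residual) vector."""
    kmeans = KMeans(n_clusters=n_centroids, n_init=10).fit(term_embeddings)
    centroids = kmeans.cluster_centers_
    ids = kmeans.labels_                          # nearest-centroid id per embedding
    residuals = term_embeddings - centroids[ids]  # delta vectors

    # Uniform quantization of residuals into N_LEVELS buckets per dimension.
    # (Real bit-packing into 1-2 bits per value is left out here.)
    lo, hi = residuals.min(), residuals.max()
    scale = (hi - lo) / (N_LEVELS - 1)
    codes = np.clip(np.round((residuals - lo) / scale), 0, N_LEVELS - 1).astype(np.uint8)
    return centroids, ids, codes, (lo, scale)

def decompress(centroids, ids, codes, quant_params):
    """Approximately reconstruct the original term embeddings."""
    lo, scale = quant_params
    residuals = codes.astype(np.float32) * scale + lo
    return centroids[ids] + residuals

# Usage: 10k 128-dim term embeddings compressed against 256 centroids.
embs = np.random.randn(10_000, 128).astype(np.float32)
centroids, ids, codes, qp = build_index(embs, n_centroids=256)
approx = decompress(centroids, ids, codes, qp)
print("mean reconstruction error:", np.abs(embs - approx).mean())
```

The storage win comes from the shared centroid table: each term embedding costs only an integer centroid ID plus a few bits per dimension for its residual, instead of a full float vector.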
