
ColBERT + ColBERTv2: late interaction at a reasonable inference cost

Neural Search Talks - Zeta Alpha


ColBERTv2's Residual Compression Technique

ColBERTv2 has improved training techniques and keeps late interaction essentially as is, but it does a few things to speed up retrieval and, more importantly, to lower storage costs. It still keeps embeddings for all the terms of all the documents in the collection, but it also computes a set of centroids over those term embeddings. Each term embedding is then stored as the ID of its nearest centroid plus a heavily quantized delta (residual) vector. This gives a clear storage improvement: the centroids are stored only once, and the delta vectors take only one or two bits per dimension.
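To make the centroid-plus-residual idea concrete, here is a minimal sketch in Python/NumPy of what such a compression scheme could look like. The function names (`compress`, `decompress`), the uniform scalar quantizer, and the toy sizes are illustrative assumptions, not ColBERTv2's actual implementation, which builds its centroids with k-means and uses its own residual bucketing.

```python
import numpy as np

def compress(term_embeddings, centroids, bits=2):
    """Encode each term embedding as (nearest-centroid ID, quantized residual)."""
    # Nearest centroid for every term embedding (brute force for clarity).
    dists = np.linalg.norm(
        term_embeddings[:, None, :] - centroids[None, :, :], axis=-1
    )
    ids = dists.argmin(axis=1)
    residuals = term_embeddings - centroids[ids]

    # Uniform scalar quantization of each residual dimension into 2**bits levels.
    scale = np.abs(residuals).max() + 1e-9
    levels = 2 ** bits
    codes = np.clip(
        np.round((residuals / scale + 1.0) / 2.0 * (levels - 1)), 0, levels - 1
    ).astype(np.uint8)
    return ids.astype(np.int32), codes, scale

def decompress(ids, codes, scale, centroids, bits=2):
    """Approximately reconstruct the original term embeddings."""
    levels = 2 ** bits
    residuals = (codes.astype(np.float32) / (levels - 1) * 2.0 - 1.0) * scale
    return centroids[ids] + residuals

# Toy usage: 1000 term embeddings of dimension 128, 32 centroids.
rng = np.random.default_rng(0)
embs = rng.normal(size=(1000, 128)).astype(np.float32)
cents = rng.normal(size=(32, 128)).astype(np.float32)

ids, codes, scale = compress(embs, cents)
approx = decompress(ids, codes, scale, cents)
```

With this layout, each term costs one centroid ID plus `bits` bits per dimension for the residual, instead of a full float vector, which is where the storage savings described above come from.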
