Transformer Memory as a Differentiable Search Index: memorizing thousands of random doc ids works!?

Neural Search Talks — Zeta Alpha

Exactly. Is It a Small Eight-Layer BERT Model?

In practical terms, they use, I think, ten clusters, right? They start with ten clusters, and those, you know, give them a zero to nine for the first digit. But then they cluster recursively within each of these. So it kind of naturally translates into a decimal thing, right? At least that's how I understand it. Okay. Now let's talk about this differentiation between unsupervised pre-training, which leads to zero-shot sort of testing of the model, versus fine-tuning on this. Yeah. We've talked about two of these training tasks, I guess we'll call them, right?
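
The recursive clustering described in this excerpt can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `kmeans` and `assign_docids` are hypothetical, random 2-D points stand in for real document embeddings, and the rule for stopping the recursion (number the documents directly once a cluster holds ten or fewer) is an assumption made here so the example terminates.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label (0..k-1) per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; empty clusters keep theirs.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def assign_docids(doc_ids, vectors, k=10, prefix=""):
    """Map each document to a digit string, one digit per clustering level."""
    if len(doc_ids) <= k:
        # Assumed leaf rule: small clusters are numbered directly.
        return {d: prefix + str(i) for i, d in enumerate(doc_ids)}
    labels = kmeans(vectors, k)
    if len(set(labels)) == 1:
        # Degenerate split (e.g. duplicate vectors): stop recursing here.
        return {d: prefix + str(i) for i, d in enumerate(doc_ids)}
    out = {}
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        # The cluster index becomes the next digit; recurse inside the cluster.
        out.update(assign_docids([doc_ids[i] for i in idx],
                                 [vectors[i] for i in idx],
                                 k, prefix + str(c)))
    return out


# Toy usage: 200 random 2-D "embeddings" stand in for real document vectors.
rng = random.Random(1)
vectors = [(rng.random(), rng.random()) for _ in range(200)]
docids = assign_docids(list(range(200)), vectors)
```

With ten clusters per level, each level contributes one decimal digit, so documents that land in the same cluster share a prefix of their identifier.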
