The Trade-Offs Between Retrieval and Modeling
A lot of large language model training uses a single-epoch regime, meaning you go through your training data once rather than multiple times. We don't fully understand how all these abilities generalize, or how teaching the model about its limitations generalizes, so that's definitely an interesting topic for research.

When you talk about bringing in citations, it seems like an alternative to having the model read the entire internet ahead of time is to let it retrieve things on the fly. What are your thoughts on the trade-offs between models that use retrieval versus models that have everything trained into their weights?
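To make the contrast in the question concrete, here is a minimal sketch of the retrieval-augmented approach: instead of relying only on knowledge baked into the weights, the model is handed passages fetched from an index at query time. The toy corpus, bag-of-words scoring, and prompt format are illustrative stand-ins, not any specific system discussed in this conversation.

```python
# Minimal sketch contrasting parametric vs. retrieval-augmented answering.
# Everything here (corpus, scoring, prompt shape) is a hypothetical placeholder.
import math
from collections import Counter

# Toy document store standing in for an external index the model
# can consult at query time (the "retrieval" side of the trade-off).
CORPUS = [
    "Single-epoch training passes over the training data exactly once.",
    "Retrieval-augmented models fetch relevant passages at query time.",
    "Parametric models store all knowledge in their trained weights.",
]

def embed(text: str) -> Counter:
    """Crude bag-of-words 'embedding'; real systems use learned encoders."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus passages most similar to the query."""
    q = embed(query)
    ranked = sorted(CORPUS, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Retrieval-augmented prompt: fetched passages become context the
    model conditions on, rather than facts it must recall from weights."""
    context = "\n".join(f"- {p}" for p in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    print(build_prompt("What is the difference between retrieval and weights?"))
```

The trade-off the question points at shows up in the code: the purely parametric path has no `retrieve` step, so its knowledge is frozen at training time, while the retrieval path can surface fresh, citable sources at the cost of an extra lookup on every query.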