The Trade-Offs Between Retrieval and Modeling
A lot of large language model training uses a single-epoch regime, meaning you go through your training data once rather than multiple times. We don't fully understand how all these abilities generalize, or how teaching the model about its limitations generalizes, so that's definitely an interesting topic for research.

When you talk about bringing in citations, it seems like an alternative to having the model read the entire internet ahead of time is to let it retrieve things on the fly. What are your thoughts on the trade-offs between models that use retrieval versus models that have everything trained into their weights?
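To make the contrast in the question concrete, here is a minimal sketch of the retrieval-augmented approach: instead of relying only on knowledge baked into the weights, the model is handed passages fetched from an index at query time. The toy corpus, bag-of-words scoring, and prompt format are illustrative stand-ins, not any specific system discussed in this conversation.

```python
# Minimal sketch contrasting parametric vs. retrieval-augmented answering.
# Everything here (corpus, scoring, prompt shape) is a hypothetical placeholder.
import math
from collections import Counter

# Toy document store standing in for an external index the model
# can consult at query time (the "retrieval" side of the trade-off).
CORPUS = [
    "Single-epoch training passes over the training data exactly once.",
    "Retrieval-augmented models fetch relevant passages at query time.",
    "Parametric models store all knowledge in their trained weights.",
]

def embed(text: str) -> Counter:
    """Crude bag-of-words 'embedding'; real systems use learned encoders."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus passages most similar to the query."""
    q = embed(query)
    ranked = sorted(CORPUS, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Retrieval-augmented prompt: fetched passages become context the
    model conditions on, rather than facts it must recall from weights."""
    context = "\n".join(f"- {p}" for p in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    print(build_prompt("What is the difference between retrieval and weights?"))
```

The trade-off the question points at shows up in the code: the purely parametric path has no `retrieve` step, so its knowledge is frozen at training time, while the retrieval path can surface fresh, citable sources at the cost of an extra lookup on every query.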