The Transferability of Language Models to Other Types of Models
In talking about the results, I don't recall you mentioning specific context lengths. Can you elaborate on that? I would expect that a big part of what you accomplished is increasing the maximum context length without unacceptable degradation in performance.

So we have trained our own models up to length 8k in a full training run. In some smaller synthetic examples, we've gone up to 32k and 128k, with some follow-ups. We're working on scaling that up, taking those more synthetic examples and bringing them to the full language-modeling setting.