The Relationship Between an 8k Context Length on a Small Model and a Larger Context Length on a Larger Model
Is there a relationship, maybe a complex one, where an 8k context length on a small model is equivalent to a larger context length on a larger model? Or, on the other hand, would it be like, okay, your technique got you to 8k, and if you apply that technique in the context of a much larger model, you're still at 8k? Does the technique's ability to increase the context length scale with the complexity of the model? You apply this technique on a small model, it gets you to 8k. If you apply it on this massive-scale model, does it get you to 64k? They're kind of orthogonal, a little bit.
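As an illustrative aside (not something named in the conversation), one way to see why a context-extension technique can be roughly orthogonal to model size is RoPE position interpolation: its scaling factor is a pure ratio of the trained context length to the target context length, with no reference to parameter count. The function names, context lengths, and head dimension below are hypothetical, and this is only a minimal sketch of one such technique, not a claim about the method the speakers have in mind.

```python
import torch

def rope_frequencies(head_dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE inverse frequencies; depends only on head_dim, not model size."""
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

def interpolated_angles(seq_len: int, head_dim: int,
                        trained_ctx: int = 2048, target_ctx: int = 8192) -> torch.Tensor:
    """Position-interpolation sketch: squeeze `target_ctx` positions into the
    range the model saw during training. The scale factor is a ratio of
    context lengths only, so nothing here changes between a small model and
    a much larger one (hypothetical lengths chosen for illustration)."""
    scale = trained_ctx / target_ctx          # e.g. 2048 / 8192 = 0.25
    positions = torch.arange(seq_len).float() * scale
    return torch.outer(positions, rope_frequencies(head_dim))

# Rotation angles for an 8k sequence with 128-dim attention heads.
angles = interpolated_angles(seq_len=8192, head_dim=128)
print(angles.shape)  # torch.Size([8192, 64])
```

Under this sketch's assumptions, whether the larger model gets more than 8k out of the same trick is an empirical question about how well it tolerates the compressed positions, not something the formula itself dictates.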