The Scale of a Transformer
I was wondering if you could just give us a sense of the scale of the models that are out there. People hear about BERT and GPT-2, and now of course we're getting flooded with GPT-3 things. Models range from 50 million parameters up to tens of billions of parameters at the top end. In practice, with some of the larger models it's unclear how you would even run them on, say, standard GPU hardware, but scale has been a big part of how transformers are used.
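As a back-of-the-envelope sketch of why the largest models strain single-GPU memory: just storing the weights in 32-bit floats takes 4 bytes per parameter. The model sizes below are commonly cited public figures used for illustration, not numbers from the conversation.

```python
# Rough memory needed just to hold model weights in fp32 (4 bytes per parameter).
# Ignores activations, gradients, and optimizer state, which add much more during training.
def weight_memory_gb(num_params, bytes_per_param=4):
    return num_params * bytes_per_param / 1e9

# Illustrative, commonly cited parameter counts (assumptions, not from the transcript).
for name, params in [("BERT-base", 110e6), ("GPT-2", 1.5e9), ("GPT-3", 175e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.1f} GB of weights")
```

At tens to hundreds of billions of parameters, the weights alone run into the hundreds of gigabytes, well beyond the 16 to 80 GB of memory on typical single GPUs, which is why such models are split across many devices.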