The Limits of Transparency in Language Models
The GPT-4 paper says that they believe it's a matter of safety to not disclose anything about the training data. In order to be able to use this safely, we need to know what's in it. The way they're getting this ersatz fluency is by taking everything they can possibly get their hands on. That strategy leads to these data sets that are probably too big to document thoroughly.