
140 - Generative AI and Copyright, with Chris Callison-Burch
NLP Highlights
OpenAI's Caginess About Disclosing Training Data
In general, proving that a model was trained on a certain kind of data is pretty difficult. And I think that's actually probably why OpenAI is now super cagey about saying exactly what they train on. So we academics were like, we demand it, for good science, and there are very good reasons for that. But there are also some really interesting tests that some people have released for trying to figure out whether or not some items exist in the data set.