140 - Generative AI and Copyright, with Chris Callison-Burch

NLP Highlights

CHAPTER

OpenAI's Caginess About Disclosing Training Data

In general, proving that a model was trained on a certain kind of data is pretty difficult. And I think that's actually probably why OpenAI is now super cagey about saying exactly what they train on. So we academics were like, we demand to know, for good science, and there are very good reasons for that. But there are also some really interesting tests that some people have released for trying to figure out whether or not some items exist in the data set.
