140 - Generative AI and Copyright, with Chris Callison-Burch

NLP Highlights

OpenAI's Caginess About Disclosing Training Data

In general, proving that a model was trained on a certain kind of data is pretty difficult. And I think that's probably why OpenAI is now super cagey about saying exactly what they train on. So we academics were like, we demand [to know], for good science, and there are very good reasons for that. But there are also some really interesting tests that people have released for trying to figure out whether or not certain items exist in the data set.
