
The Stack Overflow Podcast

Tragedy of the (data) commons

Oct 25, 2024
30:36

The Data Provenance Initiative is a collective of volunteer AI researchers from around the world. They conduct large-scale audits of the massive datasets that power state-of-the-art AI models with a goal of mapping the landscape of AI training data to improve transparency, documentation, and informed use of data. Their Explorer tool allows users to filter and analyze the training datasets typically used by large language models.

Shayne and Robert are the authors of a new study called Consent in Crisis: The Rapid Decline of the AI Data Commons, the first large-scale, longitudinal audit of the consent protocols for the web domains underlying AI training sets.

Connect with Shayne via his website.

Connect with Robert via his website or on LinkedIn.

Stack Overflow user George Hawkins earned a Populist badge by explaining "How to get base url in angular 5?"

