
The Fight Against AI Comes to a Foundational Data Set
Security, Spoken
The Conflict Surrounding Common Crawl and Copyright Issues
This chapter examines the escalating conflict between media outlets and AI companies over data redactions in Common Crawl, a dispute that has heightened tensions over copyright and over the use of the dataset for AI training. As the dispute unfolds, academic research is being significantly affected, and major news and media sites are increasingly blocking CCBot, Common Crawl's web crawler.