

IM 833: The Most Popular S3 Bucket Ever - AI Slop, Clankers, and Shrimp
Aug 21, 2025
Rich Skrenta, executive director of the Common Crawl Foundation and a trailblazer in internet archiving, shares his insights on the evolving landscape of AI. The discussion highlights the tensions between AI and content creators, emphasizing the balance between accessibility and compensation. Skrenta also explores the environmental impacts of data centers and the ethical dilemmas of AI in drug discovery. The conversation also dives into the playful world of modern slang and culture, illustrating the quirky ways technology influences language and daily life.
AI Snips
Common Crawl's Role In An Open Web
- Common Crawl provides an open, centralized web crawl that reduces redundant scraping and supports research and small projects.
- Opt-outs by publishers hurt smaller researchers and bias datasets away from important sources.
Cloudflare Can Shadow-Ban Public Sites
- CDNs and protection services can unintentionally block crawlers and shadow-ban sites, even government pages.
- Site owners may be unaware their content is inaccessible to archives and AI crawlers.
Hotels Fear Losing Discoverability To LLMs
- Rich described travel executives who spend huge SEO budgets and are now trying to get included in LLM answers to stay discoverable.
- Opting out of crawls risks losing future discoverability by AI agents and can cost small publishers traffic.