Practical AI

The world's largest open library dataset

Dec 1, 2020
In this discussion, Luke Chesser, Co-founder and Head of Product at Unsplash, and Timothy Carbone, Data Engineer at Unsplash, unveil the world's largest open library dataset, boasting over 2 million high-quality images. They explore innovative applications for machine learning and AI, the challenges of managing such vast data, and the importance of collaboration in the open data ecosystem. Their insights into the balance between sharing and business pragmatism reveal why this dataset is a game changer for researchers and developers alike.
INSIGHT

Unsplash's Origin

  • Unsplash's image repository started as a side project within a design marketplace.
  • The repository transitioned into a full company due to its open-source nature and community contributions.
INSIGHT

Dataset Content

  • Unsplash's dataset provides links to high-quality images, metadata, and user interaction data.
  • It doesn't contain the images themselves but offers Exif data, photographer details, geolocation, and more.
INSIGHT

Unsplash Business Model

  • Unsplash's business model relies on brand uploads and distribution, not direct dataset monetization.
  • Their open data approach aligns with their ethos of sharing, benefiting contributors and the broader community.