

The world's largest open library dataset
Dec 1, 2020
In this discussion, Luke Chesser, Co-founder and Head of Product at Unsplash, and Timothy Carbone, Data Engineer at Unsplash, introduce the world's largest open library dataset, which covers over 2 million high-quality images. They discuss applications for machine learning and AI, the challenges of managing data at this scale, and the importance of collaboration in the open data ecosystem. They also explain how Unsplash balances sharing with business pragmatism, and why the dataset matters to researchers and developers.
AI Snips
Unsplash's Origin
- Unsplash's image repository started as a side project within a design marketplace.
- The repository transitioned into a full company due to its open-source nature and community contributions.
Dataset Content
- Unsplash's dataset provides links to high-quality images, metadata, and user interaction data.
- It doesn't contain the images themselves but offers Exif data, photographer details, geolocation, and more.
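As a rough illustration of what working with link-plus-metadata records like these could look like, here is a minimal Python sketch. It assumes the records arrive as tab-separated text; the column names (`photo_id`, `image_url`, `photographer`, `camera_make`, `latitude`, `longitude`) and the sample rows are hypothetical, invented for this example, and are not taken from the episode or from the dataset's actual schema.

```python
import csv
import io

# Hypothetical sample: these column names and values are illustrative only,
# not the dataset's actual schema.
SAMPLE_TSV = (
    "photo_id\timage_url\tphotographer\tcamera_make\tlatitude\tlongitude\n"
    "abc123\thttps://example.com/abc123.jpg\tJane Doe\tCanon\t45.50\t-73.57\n"
    "def456\thttps://example.com/def456.jpg\tJohn Roe\t\t\t\n"
)


def load_records(tsv_text):
    """Parse tab-separated text into a list of dicts, one per photo."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)


def geotagged(records):
    """Keep only records that carry both a latitude and a longitude."""
    return [r for r in records if r["latitude"] and r["longitude"]]


records = load_records(SAMPLE_TSV)
located = geotagged(records)
```

Because each record holds a link rather than the image itself, a consumer would fetch images separately from the URL column; this sketch only handles the metadata side.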
Unsplash Business Model
- Unsplash's business model relies on brand uploads and distribution, not direct dataset monetization.
- Their open data approach aligns with their ethos of sharing, benefiting contributors and the broader community.