The world's largest open library dataset

Dec 1, 2020

Guest

Timothy Carbone

Guest

Luke Chesser

In this discussion, Luke Chesser, Co-founder and Head of Product at Unsplash, and Timothy Carbone, Data Engineer at Unsplash, unveil the world's largest open library dataset, boasting over 2 million high-quality images. They explore innovative applications for machine learning and AI, the challenges of managing such vast data, and the importance of collaboration in the open data ecosystem. Their insights into the balance between sharing and business pragmatism reveal why this dataset is a game changer for researchers and developers alike.

INSIGHT

Unsplash's Origin

Unsplash's image repository started as a side project within a design marketplace.
The repository transitioned into a full company due to its open-source nature and community contributions.

INSIGHT

Dataset Content

Unsplash's dataset provides links to high-quality images, metadata, and user interaction data.
It doesn't contain the images themselves but offers Exif data, photographer details, geolocation, and more.

INSIGHT

Unsplash Business Model

Unsplash's business model relies on brand uploads and distribution, not direct dataset monetization.
Their open data approach aligns with their ethos of sharing, benefiting contributors and the broader community.
