Data Science at Spotify with Boxun Zhang

Machine Learning Archives - Software Engineering Daily

How to Reduce the Time Expended Cleaning Data?

Data scientists often spend up to 80% of their time cleaning data. Spotify has more than forty petabytes of data. Every day we have about 30 terabytes of data ingested from Kafka. And our own pipelines also generate another 400 terabytes of data within our Hadoop cluster.
