
Data Engineering for ML // Chad Sanderson // Coffee Sessions #117

MLOps.community

CHAPTER

Data Quality and Data Validation - The Biggest Problem in Data Science

At Slack, first and foremost, we relied primarily on our structured logs. We had Thrift schemas for our structured logs, which were designed to go into the data warehouse. Those schemas were integrated with code review and with CI/CD checks, so that if you introduced a backwards-incompatible change to the logs, the CI/CD check would fail. The check could be overridden, but you had to have a conversation with people before you did it, right?

John: It's certainly the case that people make these changes unintentionally, through ignorance or whatever.
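The kind of CI/CD check described above can be illustrated with a minimal Python sketch. This is not Slack's actual tooling, and it does not parse real Thrift IDL files: the `Field` class, `find_breaking_changes`, and `ci_check` are hypothetical names, and a schema is assumed to be a plain mapping from field ID to field definition. The rules it applies (removing a field, changing a field's type, or adding a new required field) are common examples of backwards-incompatible changes, not an exhaustive list.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for one field in a structured-log schema."""
    name: str
    type: str
    required: bool = False


# Assumption: a schema is a mapping from numeric field ID to its definition.
Schema = Dict[int, Field]


def find_breaking_changes(old: Schema, new: Schema) -> List[str]:
    """Return a description of each backwards-incompatible change found."""
    problems = []
    for field_id, old_field in old.items():
        new_field = new.get(field_id)
        if new_field is None:
            problems.append(f"field {field_id} ({old_field.name}) was removed")
        elif new_field.type != old_field.type:
            problems.append(
                f"field {field_id} ({old_field.name}) changed type "
                f"from {old_field.type} to {new_field.type}"
            )
    for field_id, new_field in new.items():
        if field_id not in old and new_field.required:
            problems.append(
                f"field {field_id} ({new_field.name}) was added as required"
            )
    return problems


def ci_check(old: Schema, new: Schema, override_approved: bool = False) -> bool:
    """Pass when there are no breaking changes, or when an override was agreed."""
    return not find_breaking_changes(old, new) or override_approved


# Example with made-up log fields.
old_schema = {1: Field("user_id", "i64", required=True), 2: Field("channel", "string")}
additive = {**old_schema, 3: Field("team", "string")}   # new optional field
breaking = {1: Field("user_id", "string", required=True)}  # type change, field 2 dropped

print(ci_check(old_schema, additive))
print(ci_check(old_schema, breaking))
print(ci_check(old_schema, breaking, override_approved=True))
```

The `override_approved` flag models the point made in the conversation: the check can be bypassed, but only as an explicit decision rather than by accident.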
