How Shopify Is Building Their Production Data Warehouse Using DBT

Data Engineering Podcast

Scaling in the Data Warehouse

I'm wondering if you have explored anything along the lines of building specific contracts and semantic meaning into the way column names are formatted, to help reduce ambiguity about what a given column means in context. Another thing I'm interested in understanding is how you're managing things like master data management, so that you can establish canonical references for a particular metric or for the specific meaning of a given scalar value, and what challenges you're facing on that front. Would mostly be aiming at having some sort of consistency check where, for example, unused data sets get de-scheduled after 90 days of inactivity, among others.
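
The 90-day inactivity check mentioned above could look roughly like the following. This is a minimal sketch, not anything described in the episode: the `stale_datasets` function, the `last_used` mapping, and the choice to treat a data set as inactive only when its last use is strictly more than 90 days old are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumed threshold, taken from the "90 days of inactivity" mentioned in the snippet.
INACTIVITY_LIMIT = timedelta(days=90)


def stale_datasets(last_used, today):
    """Return the names of data sets whose last use is more than 90 days before `today`.

    `last_used` is a hypothetical mapping of data set name -> date of last use.
    The result is sorted so the output is deterministic.
    """
    cutoff = today - INACTIVITY_LIMIT
    return sorted(name for name, used in last_used.items() if used < cutoff)
```

A scheduler could call `stale_datasets` once a day and de-schedule whatever it returns. Where the last-use dates come from (query logs, for instance) is left open here.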
