

The Data Stack Show
RudderStack
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
Episodes

Aug 17, 2022 • 54min
100: Data Quality Is Relative to Purpose with James Campbell of Superconductive
Highlights from this week’s conversation include:
James’ role at Great Expectations (2:33)
What Great Expectations does (5:49)
How Great Expectations approaches data quality (7:01)
Why a data engineer should use Great Expectations (16:41)
Defining “data quality” (19:16)
Translating expectations from one domain to the other (27:00)
Community around Great Expectations (30:59)
The user experience (33:41)
Something exciting on the horizon (40:27)
Interacting with marketers in a non-technical way (43:57)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

Aug 12, 2022 • 4min
The PRQL: What’s the Hardest Part About Data Quality?
Eric and Kostas preview their upcoming conversation with James Campbell at Superconductive.

Aug 10, 2022 • 1h 13min
99: State of the Data Lakehouse with Vinoth Chandar of Apache Hudi
Highlights from this week’s conversation include:
Vinoth’s background and career journey (3:08)
Defining “data lakehouse” (5:10)
Databricks versus lakehouses (13:37)
The services a lakehouse needs (17:37)
How to communicate technical details (26:55)
Onehouse’s product vision (31:41)
Lakehouse performance versus BigQuery solutions (36:44)
How to deliver customer experience equally (40:17)
How to start building a lakehouse (44:00)
Big tech’s effect on smaller lakehouses (55:33)
Skipping the data warehouse (1:04:39)

Aug 5, 2022 • 5min
The PRQL: Does Lakehouse Architecture Really Mean the End of the Data Warehouse and Data Lake As We Know It?
In this bonus episode, Eric and Kostas preview their upcoming conversation with Vinoth Chandar of Apache Hudi.

Aug 3, 2022 • 1h 2min
98: Category Theory and the Mathematical Foundation of the Technologies We Use with Eric Daimler of Conexus
Highlights from this week’s conversation include:
Eric’s background and career journey (3:30)
Presenting to people without knowledge of AI (11:04)
Why math was chosen over AI (19:03)
From compilers to databases (25:42)
The contribution of category theory (30:09)
The Conexus customer experience (37:45)
The primary user of Conexus (46:33)
Interacting with 300,000 databases (51:07)
When Conexus begins to add value (54:02)
The best way to learn this mathematical approach (55:46)

Jul 29, 2022 • 4min
The PRQL: Farm to Table Abstract Mathematics
Eric and Kostas preview their upcoming conversation with Eric Daimler of Conexus AI.

Jul 27, 2022 • 54min
97: How To Build an Organization-Empowering Data Team with Emilie Schario of Amplify Partners
Highlights from this week’s conversation include:
Emilie’s background and career journey (3:00)
Hypergrowth at GitLab (5:23)
Being close to the money in data (9:50)
Big things taken from GitLab to Netlify (13:00)
Defining “data organization” (17:53)
The first roles you should hire for (22:06)
Defining “analytics engineer” (23:44)
One role to bridge different needs (27:26)
Why data analysts are needed (30:51)
How to avoid a kitchen sink of data (40:20)
Data engineer archetype (45:48)
Data roles crossing over (48:09)

Jul 22, 2022 • 4min
The PRQL: If You Were Building a Data Team What Would Your First Hire Be?
Eric and Kostas preview their upcoming conversation with Emilie Schario from Amplify Partners.

Jul 20, 2022 • 55min
96: How To Collect and Leverage Data From the Physical World with Prateek Joshi of Plutoshift
Highlights from this week’s conversation include:
Prateek’s background and career journey (2:10)
The lack of advanced data tools for the physical world (4:55)
Dealing with data from the physical world (10:53)
Stocks in the physical world (14:20)
What it takes to execute this kind of project (19:05)
Challenges around this infrastructure (25:56)
ML tools that are useful in this environment (31:55)
Physical instrumentation and environmental interaction (36:43)
Current adoption of physical instrumentation (42:50)
Data’s responsibility in sustainability (45:56)

Jul 15, 2022 • 3min
The PRQL: Collecting Data in the Physical World
Eric and Kostas preview their upcoming conversation with Prateek Joshi.