

The Data Stack Show
RudderStack
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
Episodes

Feb 14, 2024 • 1h 7min
177: AI-Based Data Cleaning, Data Labelling, and Data Enrichment with LLMs Featuring Rishabh Bhargava of refuel
Rishabh Bhargava, an expert in AI-based data cleaning, data labelling, and data enrichment with LLMs, discusses topics like the evolution of AI and LLMs, implementing use cases and cost considerations, categorizing search queries, benchmarking and evaluation, utilizing customer support ticket data, understanding confidence scores, and training models with human feedback.

Feb 12, 2024 • 4min
The PRQL: Exploring the Evolution of AI and ML with Rishabh Bhargava of refuel
Rishabh Bhargava, AI and ML expert, discusses his background in data and AI. The hosts also explore refuel's mission to make reliable data accessible to teams and businesses.

Feb 7, 2024 • 53min
176: The Fundamentals of Event-Driven Orchestration and How Generative AI Is Shaping Its Future with Viren Baraiya of orkes.io
Highlights from this week’s conversation include:
Viren’s background in data (0:39)
Evolution of Orchestration (1:52)
AI Orchestration (3:00)
Understanding Conductor and orkes (6:26)
Event-Driven Orchestration (8:10)
Viren’s Transition to Founder (12:27)
Non-Technical Aspects of Being a Founder (15:50)
Democratizing AI for Developers (18:16)
The evolution of microservices orchestration (21:56)
Challenges in appealing to the 99% developer group (24:32)
Value of orchestration for developers (30:31)
Role of orchestrators in managing faults (37:37)
The intersection of AI and orchestration (40:27)
Evolution of AI (44:04)
Thriving in AI Environment (47:58)
Final thoughts and takeaways (51:25)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Feb 5, 2024 • 4min
The PRQL: The Evolution of Application Orchestration Featuring Viren Baraiya of orkes.io
In this bonus episode, Eric and Kostas preview their upcoming conversation with Viren Baraiya of orkes.io.

Jan 31, 2024 • 1h 19min
175: The Parts, Pieces, and Future of Composable Data Systems, Featuring Wes McKinney, Pedro Pedreira, Chris Riccomini, and Ryan Blue
Data systems experts Wes McKinney, Pedro Pedreira, Chris Riccomini, and Ryan Blue discuss the concept of composable data systems, the challenges and incentives for composable components, specialization and modularity in data workloads, and the efficiency and common layers in data management systems. They also explore the evolution of data system composability, exciting new projects in data systems, and the challenges of standardizing APIs.

Jan 29, 2024 • 5min
The PRQL: Exploring the Evolution, Challenges, and Benefits of Composable Data Stacks Featuring Wes McKinney, Pedro Pedreira, Chris Riccomini, and Ryan Blue
In this bonus episode, Eric and Kostas preview their upcoming discussion with a panel of experts: Wes McKinney (Co-Founder, Voltron), Pedro Pedreira (Software Engineer, Meta), Chris Riccomini (Seed Investor, various startups), and Ryan Blue (Co-Founder and CEO, Tabular).

Jan 24, 2024 • 58min
174: Does Your Data Stack Need a Semantic Layer? Featuring Artyom Keydunov of Cube Dev
Highlights from this week’s conversation include:
Artyom’s background in the data space (0:32)
The growth and changes at Cube (5:58)
Pain points of managing metrics definitions across different tools (9:39)
Trade-offs between coupled and decoupled semantic layers (12:12)
Making a case for implementing a semantic layer (14:17)
The evolution of semantic layers (23:28)
Challenges in designing a decoupled semantic layer (24:16)
Different approaches to solving the interface problem (26:58)
Implementing a SQL engine in Cube (35:58)
Overhead and debugging in semantic layers (39:08)
The semantic layer and its importance (46:26)
The need for semantics in data products (47:34)
What’s the future of semantic layers and user experience? (51:49)
Final thoughts and takeaways (57:34)

Jan 22, 2024 • 3min
The PRQL: Why is a Semantic Layer Important in the Modern Data Stack? Featuring Artyom Keydunov of Cube Dev
In this bonus episode, Eric and Kostas preview their upcoming conversation with Artyom Keydunov of Cube Dev.

Jan 17, 2024 • 47min
173: Data Analytics Is a Team Sport, Featuring Jay Henderson of Alteryx
Highlights from this week’s conversation include:
No Code Analytics (1:22)
Analytics as a Team Sport (2:31)
The workflow of someone without Alteryx (11:27)
Alteryx's ability to handle diverse data sources (14:32)
The balance between ease of use and complexity (23:06)
Enabling casual end users with a no code interface (24:19)
Taking analytics to the data (31:47)
The boundaries between data engineers and end users (33:44)
The importance of collaboration in analytics (34:12)
The potential of every employee being a data worker (35:28)
The human nature of the product and users in large enterprises (45:38)
Final thoughts and takeaways (46:21)

Jan 15, 2024 • 4min
The PRQL: Bridging the Gap Between Messy Data and Sophisticated Analytics with Jay Henderson of Alteryx
In this bonus episode, Eric and Kostas preview their upcoming conversation with Jay Henderson of Alteryx.