

The Data Stack Show
RudderStack
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
Episodes

Jan 10, 2024 • 56min
172: How WebAssembly is Enabling the Third Wave of Cloud Compute with Matt Butcher of Fermyon Technologies
Matt Butcher, Co-founder of Fermyon Technologies and WebAssembly expert, discusses his background, the potential of WebAssembly for cloud computing, the benefits of WebAssembly, and the challenges and progress in this field. Topics include enhanced security models, Google's early containers, scaling and anticipating requests, comparison of virtual machines, containers, and micro VMs, fast startup times in WebAssembly, metaphysics and software development, effective communication in code development, and requirements of different teams and jobs.

Jan 8, 2024 • 5min
The PRQL: WebAssembly: The Future of Cloud Workloads Made Simple with Matt Butcher of Fermyon Technologies
In this bonus episode, Eric and Kostas preview their upcoming conversation with Matt Butcher of Fermyon Technologies. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.

Jan 3, 2024 • 56min
171: Machine Learning Pipelines Are Still Data Pipelines with Sandy Ryza of Dagster
Guest Sandy Ryza, an expert in machine learning pipelines, discusses the role of orchestrators in the data lifecycle, changes in DataOps and MLOps, data cleaning, and an overview of Dagster. The conversation also explores the difference between data assets and tasks in data pipelines, how lineage and data assets are defined in Dagster, and the benefits of a unified orchestration framework. It also touches on orchestration in the development phase and the emergence of the analytics engineer role.

Jan 2, 2024 • 4min
The PRQL: Does Machine Learning Need Its Own Orchestrator? Featuring Sandy Ryza of Dagster
Sandy Ryza from Dagster Labs discusses the role of an orchestrator in DataOps and MLOps. The conversation also emphasizes the need for diverse solutions in the MLOps space.

Dec 27, 2023 • 54min
170: Discussing Data Roles and Solving Data Problems with Katie Bauer of GlossGenius
Highlights from this week’s conversation include:
The evolution of the data scientist role (1:03)
Common problems in different companies (2:05)
Measuring and curating content on Reddit (4:29)
The challenges of working with unstructured content at Reddit and Twitter (11:03)
Lessons learned from Reddit and applying them at Twitter (13:17)
Data challenges and customer behavior analysis at GlossGenius (20:16)
How the data scientist's role has changed over time (25:10)
The essence of the data scientist/engineer role (29:00)
Dynamics and overlaps between different data roles (32:09)
The perfect data team for Twitter (34:19)
Building a data team at a startup like GlossGenius (36:36)
The right time to bring in a dedicated data person in a startup (38:52)
The analytics engineer role (46:25)
Challenges in implementing telemetry (50:31)
Final thoughts and takeaways (52:24)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

Dec 26, 2023 • 3min
The PRQL: What is a Data Scientist? Featuring Katie Bauer of GlossGenius
In this bonus episode, Eric and Kostas preview their upcoming conversation with Katie Bauer of GlossGenius.

Dec 20, 2023 • 1h 6min
169: Data Models: From Warehouse to Business Impact with Tasso Argyros of ActionIQ
Highlights from this week’s conversation include:
The Evolution of Databases and Data Systems (2:33)
Abstracting Data for Business Users (4:31)
Building a Database for Google-like Search (7:58)
The Big Data Explosion (11:10)
Selling Myspace as First Customer (13:14)
Starting ActionIQ (16:57)
The customer-centric organization (22:46)
Transitioning to customer data focus (23:53)
Understanding business users' needs (28:30)
Supporting Arbitrary Queries and Data Models (34:42)
Unique Technical Perspective of Clickstream Data (37:01)
The value per terabyte of data (46:45)
Building a product for multiple personas (50:45)
Composability and Benefits (58:05)
Evolution of Storage and Compute (1:00:09)
Composability and Treasure Data (1:02:10)

Dec 18, 2023 • 6min
The PRQL: From Databases to Customer Data Platforms with Tasso Argyros of ActionIQ
In this bonus episode, Eric and Kostas preview their upcoming conversation with Tasso Argyros of ActionIQ.

Dec 13, 2023 • 57min
168: Decoding Data Mesh: Principles, Practices, and Real-World Applications Featuring Paolo Platter, Zhamak Dehghani, and Melissa Logan
Highlights from this week’s conversation include:
Defining data mesh (6:37)
Addressing the scale of organizational complexity and usage (9:04)
The shift from monolithic to microservices (12:24)
The sociological structure in data mesh (13:59)
Data product generation and sharing in data mesh (17:27)
Data Mesh: Simplifying Data Work (24:09)
Getting Started with Data Mesh (29:14)
Building products for Data Mesh (36:42)
Building a customizable and extensible platform to shape data practice (39:28)
The characteristics of a data product (48:40)
Defining what a data product is not (50:45)
The origin of the term "mesh" in data mesh (53:32)

Dec 11, 2023 • 3min
The PRQL: A Data Mesh Deep Dive with Paolo Platter, Zhamak Dehghani, and Melissa Logan
In this bonus episode, Eric and Kostas preview their upcoming conversation about data mesh with Paolo Platter, Zhamak Dehghani, and Melissa Logan.