The Data Stack Show

RudderStack

Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

Episodes

Jan 10, 2024 • 56min

172: How WebAssembly is Enabling the Third Wave of Cloud Compute with Matt Butcher of Fermyon Technologies

Matt Butcher, Co-founder of Fermyon Technologies and WebAssembly expert, discusses his background, the potential of WebAssembly for cloud computing, the benefits of WebAssembly, and the challenges and progress in this field. Topics include enhanced security models, Google's early containers, scaling and anticipating requests, comparison of virtual machines, containers, and micro VMs, fast startup times in WebAssembly, metaphysics and software development, effective communication in code development, and requirements of different teams and jobs.

Jan 8, 2024 • 5min

The PRQL: WebAssembly: The Future of Cloud Workloads Made Simple with Matt Butcher of Fermyon Technologies

In this bonus episode, Eric and Kostas preview their upcoming conversation with Matt Butcher of Fermyon Technologies. Hosted by Simplecast, an AdsWizz company. See https://pcm.adswizz.com for information about our collection and use of personal data for advertising.

Jan 3, 2024 • 56min

171: Machine Learning Pipelines Are Still Data Pipelines with Sandy Ryza of Dagster

Guest Sandy Ryza, an expert in machine learning pipelines, discusses the role of orchestrators in the lifecycle of data, changes in data ops and MLOps, data cleaning, and the overview of Dagster. They also explore the difference between data assets and tasks in data pipelines, defining lineage and data assets in Dagster, and the benefits of a unified orchestration framework. Additionally, they touch on orchestration in the development phase and the emergence of the analytics engineer role.

Jan 2, 2024 • 4min

The PRQL: Does Machine Learning Need Its Own Orchestrator? Featuring Sandy Ryza of Dagster

Sandy Ryza from Dagster Labs discusses the role of an orchestrator in Data Ops and ML Operations. They also emphasize the need for diverse solutions in the ML operations space.

Dec 27, 2023 • 54min

170: Discussing Data Roles and Solving Data Problems with Katie Bauer of GlossGenius

Highlights from this week’s conversation include:

The evolution of the data scientist role (1:03)
Common problems in different companies (2:05)
Measuring and curating content on Reddit (4:29)
The challenges of working with unstructured content at Reddit and Twitter (11:03)
Lessons learned from Reddit and applying them at Twitter (13:17)
Data challenges and customer behavior analysis at GlossGenius (20:16)
How the data scientist's role has changed over time (25:10)
The essence of the data scientist/engineer role (29:00)
Dynamics and overlaps between different data roles (32:09)
The perfect data team for Twitter (34:19)
Building a data team at a startup like GlossGenius (36:36)
The right time to bring in a dedicated data person in a startup (38:52)
The analytics engineer role (46:25)
Challenges in implementing telemetry (50:31)
Final thoughts and takeaways (52:24)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

Dec 26, 2023 • 3min

The PRQL: What is a Data Scientist? Featuring Katie Bauer of GlossGenius

In this bonus episode, Eric and Kostas preview their upcoming conversation with Katie Bauer of GlossGenius.

Dec 20, 2023 • 1h 6min

169: Data Models: From Warehouse to Business Impact with Tasso Argyros of ActionIQ

Highlights from this week’s conversation include:

The Evolution of Databases and Data Systems (2:33)
Abstracting Data for Business Users (4:31)
Building a Database for Google-like Search (7:58)
The Big Data Explosion (11:10)
Selling Myspace as First Customer (13:14)
Starting ActionIQ (16:57)
The customer-centric organization (22:46)
Transitioning to customer data focus (23:53)
Understanding business users' needs (28:30)
Supporting Arbitrary Queries and Data Models (34:42)
Unique Technical Perspective of Clickstream Data (37:01)
The value per terabyte of data (46:45)
Building a product for multiple personas (50:45)
Composability and Benefits (58:05)
Evolution of Storage and Compute (1:00:09)
Composability and Treasure Data (1:02:10)

Dec 18, 2023 • 6min

The PRQL: From Databases to Customer Data Platforms with Tasso Argyros of ActionIQ

In this bonus episode, Eric and Kostas preview their upcoming conversation with Tasso Argyros of ActionIQ.

Dec 13, 2023 • 57min

168: Decoding Data Mesh: Principles, Practices, and Real-World Applications Featuring Paolo Platter, Zhamak Dehghani, and Melissa Logan

Highlights from this week’s conversation include:

Defining data mesh (6:37)
Addressing the scale of organizational complexity and usage (9:04)
The shift from monolithic to microservices (12:24)
The sociological structure in data mesh (13:59)
Data product generation and sharing in data mesh (17:27)
Data Mesh: Simplifying Data Work (24:09)
Getting Started with Data Mesh (29:14)
Building products for Data Mesh (36:42)
Building a customizable and extensible platform to shape data practice (39:28)
The characteristics of a data product (48:40)
Defining what a data product is not (50:45)
The origin of the term "mesh" in data mesh (53:32)

Dec 11, 2023 • 3min

The PRQL: A Data Mesh Deep Dive with Paolo Platter, Zhamak Dehghani, and Melissa Logan

In this bonus episode, Eric and Kostas preview their upcoming conversation regarding Data Mesh with Paolo Platter, Zhamak Dehghani, and Melissa Logan.
