The Data Stack Show

Latest episodes

Sep 22, 2021 • 1h 9min

54: The Center of the Modern Data Stack with Neil Rahilly of Mixpanel

Highlights from this week’s conversation include:
Neil’s programming hobby turned into a career and how he cold-contacted Mixpanel for a job (2:28)
Lessons learned from nine years at Mixpanel (5:05)
Defining product analytics (8:06)
How Mixpanel has evolved into the product it is today (10:56)
The importance of Mixpanel’s real-time analysis (19:52)
Looking at Arb, Mixpanel’s own arbitrary segmentation database (23:44)
The business impact that the rise of the cloud data warehouse had on Mixpanel (34:56)
Sub-second latencies and real-time use cases (49:05)
Career advice from Neil (1:02:02)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

Sep 15, 2021 • 1h 20min

53: What Religion, a Cult, and a Tech Product Have in Common, with Bart Farrell of DoKC

Highlights from this week’s conversation include:
Bart’s journey from southern California, to New York, to Egypt, to London, to Spain (3:31)
Exposure to different communities and finding shared language and experience (10:21)
Looking back at early online communities and how they furthered your learning journey (27:50)
How the level of niche-ness impacts a community (44:06)
The cautionary tale of WeWork (57:28)
Surefire community killers (1:03:44)
Open source communities in tech and the passion that drives them (1:08:11)

Follow the Data on Kubernetes Community at DoK.community and on Twitter at @DoKCommunity. You can follow Bart at @birthmarkbart.

Sep 8, 2021 • 1h 9min

52: Discussing Data Warehouses, Lakes, and Meshes with James Serra of EY

Highlights from this week’s conversation include:
James’ background at Microsoft and current work with EY’s data fabric (2:22)
The external and internal facing components of EY’s data fabric (6:39)
The importance of data lineage (11:29)
The most important requirements for data quality (15:32)
Looking at the data capabilities of Microsoft (21:30)
The data warehouse, explained (29:00)
Using a data warehouse or a data lake (34:33)
Defining the buzzword data mesh (51:13)
The problem with data mesh (59:31)

Sep 1, 2021 • 55min

51: Democratizing AI and ML with Tristan Zajonc of Continual

Topics in this wide-ranging conversation include:
Tristan’s background with Cloudera and the need for continual operational ML and AI (3:15)
How the complexity of Continual is hidden behind a simplicity of use (14:48)
Focusing on data that lives within a data warehouse (18:43)
Understanding features in the ML conversation (22:47)
The three layers of Continual (26:11)
The importance of SQL to Continual (30:19)
Caching layers and the data warehouse centric approach (38:28)
Betting on the warehouse being a central component of data stack architecture (43:34)

Aug 25, 2021 • 59min

50: From Data Infrastructure to Data Management with Ananth Packkildurai

Highlights from this week’s episode:
Ananth’s background (2:51)
The evolution of Slack (4:54)
Kafka and Presto: two of the most reliable and flexible tools for Ananth (9:43)
How Snowflake gained an advantage over Presto (13:24)
Opinions about data lakes (17:23)
Core features of data infrastructure (23:22)
The tools define the process, and not the other way around (31:30)
Defining a data mesh (36:44)
Data is inherently social in nature (40:31)
Lessons learned from writing Data Engineering Weekly (49:14)

Aug 18, 2021 • 55min

49: MLops - The Finalization of the Data Stack with Ben Rogojan of Facebook

Topics in this conversation include:
Ben's background and his shift to data engineering (2:19)
Trends in the data space: finding the most efficient tools, the Snowflake phenomenon, and keeping up with new functionalities (5:33)
Key differences in data practices in small companies and Facebook-sized companies (12:38)
Having to build tools specifically designed for Facebook because of SaaS product limitations (16:00)
Team structure at Facebook (18:17)
Developing more robust systems that are resistant to pipeline failure (19:50)
Defining data stacks (24:01)
A sample data stack for a young company (28:37)
Why Redshift and Snowflake have trended in the opposite direction (33:02)
BigQuery and Snowflake comparisons (36:06)
MLOps and whose responsibility it is (39:12)
Feast, Tecton, and feature stores (45:40)
Having a good community around an open source product (49:30)

Aug 11, 2021 • 33min

48: Season Two Recap with Eric Dodds and Kostas Pardalis

Highlights from this week’s episode:
Dissecting the different team structures from organizations in season two (1:16)
The people behind the data are key to the data itself (9:17)
Open source licensing and the core components needed for large scale commercial viability (15:13)
Game-changing core technologies in the new data economy (22:09)
The Snowflake vs. Databricks battle: "The UFC of Geeks" (25:54)

Aug 4, 2021 • 51min

47: Taming the Four Dragons of Data with Sven Balnojan of Mercateo Gruppe

Highlights from this week’s episode include:
Sven's Ph.D. in Singularity Theory (2:59)
The Databricks vs. Snowflake conversation (8:17)
The difficulty of not just inventing something new, but making it accessible (18:01)
Databricks and unstructured data (22:22)
Organizational change responding to technological change (29:27)
The three-dimensional evolution of a successful open source project (40:31)

Jul 28, 2021 • 56min

46: A New Paradigm in Stream Processing with Arjun Narayan of Materialize

Highlights from this week’s episode include:
Introducing Arjun and how he fell in love with databases (2:51)
Looking at what Materialize brings to the stack (5:28)
Analytics starts with a human in the loop and comes into its own when analysts get themselves out and automate it (15:46)
Using Materialize instead of the materialized view from another tool (18:44)
Comparing Postgres and Materialize and looking at what's under the hood of Materialize (23:16)
Making Materialize simple to use (32:33)
Why Materialize doubled down on writing 100% in Rust (35:43)
The best use case to start with (42:03)
Lessons learned from making Materialize a cloud offering (44:22)
Keeping databases to the cloud for low latency (48:31)

Jul 21, 2021 • 56min

45: Open Source and Attribution with Ophir Prusak of Codesmith

Highlights from today's conversation include:
Ophir's decision to switch from software engineering to marketing and riding the startup train (2:39)
Open sourcing in the world of software (5:55)
How open source has changed Ophir's life as a marketeer working at startups (10:28)
Chartio's sunsetting drove Ophir to search for a data tooling replacement (27:27)
Discussing trends in adoption of tools for small scale and large scale companies (35:01)
Data challenges related to attribution: how wrong do you want to be? (44:07)
