The Data Stack Show

Latest episodes

Jul 14, 2021 • 58min

44: Leveraging Data in a Post-Covid World with Ruben Ugarte of Practico Analytics

Highlights from this week's episode:
Ruben's background (2:36)
Massive shifts in data caused by COVID (4:47)
Big Tech is no longer untouchable (9:54)
Accelerations in the BI space (15:17)
A focus on people and on trust (23:43)
Numbers are filtered by the biases of the people viewing them (28:46)
AI trends and adoption (38:06)
Using qualitative data for insights, particularly at early stages (40:56)
Recommendations for taking stock of who is using the data and assessing what their skills are (50:06)

The Data Stack Show is a weekly podcast powered by RudderStack. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Jul 7, 2021 • 46min

43: Modern Authentication and User Management with Sokratis Vidros of Clerk.dev

Highlights from this week's episode:
Sokratis' realization that big corporations were not the best thing for him (2:56)
Transitioning from Workable to Clerk.dev (3:40)
Convincing developers to outsource components to a service (9:36)
Clerk's layered solutions and how it affects the developer and the end-user (12:41)
Starting with Clerk from scratch vs. using Clerk to replace an existing component (19:55)
Synergies and SaaS starter kit (24:06)
Building Clerk to avoid a single point of failure (29:19)
Reflecting back on the transformation and growth of Workable, and how it was like working at eight different companies (33:03)
Lessons that Sokratis has taken away from his years as a developer (42:18)
Jun 30, 2021 • 53min

42: Scaling Data Science with Ryan Boyer of Shipt

Highlights from this week’s episode include:
Ryan's full circle path from stocking shelves at Target to using data science for a company owned by Target (2:00)
Building great tools and wielding them effectively (5:04)
Changes at Shipt since being acquired (9:29)
How people’s bias impacts models built by data scientists (12:30)
The different data sources Shipt incorporates (22:02)
How Ryan's work as a data scientist has changed as Shipt has grown (25:29)
How data science helps marketing (31:38)
Improving search experience (34:23)
Shipt's evolving data stack (38:27)
New trends in data science (47:06)
Jun 23, 2021 • 50min

41: Doing MLOps on Top of Apache Pulsar and Trino with Joshua Odmark of Pandio

Highlights from this week’s episode:
Joshua started his first company at age 15 and then sold two more startups after that (2:15)
Embracing the open source movement and not reinventing the wheel if you don't have to (12:15)
Pulsar seemed built to address Kafka's weaknesses (17:23)
Using Redis as a coordinator for federated learning and taking advantage of its portability (23:05)
The pillars of Pandio and some practical use cases (31:24)
Feature stores and model versioning (38:23)
Seeing Pulsar as the future because of the ability to run tens of millions of topics (41:04)
Jun 16, 2021 • 58min

40: Graph Processing on Snowflake for Customer Behavioral Analytics

Highlights from this week’s episode include:
Launching Affinio and the engineering backgrounds of the co-founders (2:36)
The massive transformation in customer data privacy regulation in the past eight years (6:23)
Creating the underpinning technology that can apply to any customer behavioral data set (10:05)
Ranking and scoring surfing patterns and sorting nodes and edges (14:13)
Placing the importance of attributes into a simple UI experience (19:28)
Going from a columnar database to a graph processing system (25:20)
Working with custom or atypical data (32:46)
The decision to work with Snowflake (37:43)
Next steps for utilizing third-party tools within Snowflake (52:18)
Jun 11, 2021 • 50min

39: Diving deeper into CDC with Ali Hamidi and Taron Foxworth of Meroxa

Highlights from this week’s episode include:
Meroxa is a real-time data engineering managed platform (4:53)
Use cases for CDC (6:20)
Meroxa leverages open source tools to provide initial snapshots and start the CDC stream (12:29)
Making the platform publicly available (14:14)
What the Meroxa user experience looks like (16:10)
Raising Series A funding (17:49)
Easiest and most difficult data sources for CDC (20:23)
The current state of open CDC (23:16)
Expected latency when using CDC (29:56)
CDC, reverse ETL, and a focus on real-time (36:39)
Are existing parts of the stack when Meroxa is adapted? (39:45)
Jun 2, 2021 • 51min

38: Graph Databases & Data Governance with David Allen of Neo4j

Highlights from this week's episode include:
David’s background in comparative databases (1:50)
David’s experience and lessons he learned from writing his book (3:23)
How writing a technical book compares to writing technical documentation (4:41)
The process of writing a book (6:30)
The best and worst part of David’s book writing experience (8:02)
An introduction to what Neo4j is (9:08)
What you need to graph (11:13)
Typical problems a graph database is a good solution for (13:00)
The difference between performance and relational databases (18:41)
How Neo4j addresses performance and ergonomics (23:30)
Neo4j and scalability (26:20)
How Neo4j fits in the modern data stack (31:48)
Neo4j use cases (35:45)
Practical implementation of Neo4j (40:51)
Neo4j’s relationship with open source (45:50)
May 26, 2021 • 54min

37: The Components of Data Governance with Dave Melillo of FanDuel

Highlights from this week's episode include:
Dave's "nerdy" interests in sports statistics and data (2:12)
Trends in collecting, processing, and using data (4:45)
Finding a better term for "reverse ETL" (5:48)
The blurring of the distinction between sources and destinations (7:41)
The role of BI is changing (13:24)
Data governance and the physical execution behind it (19:00)
Data governance is defining and managing data in a logical way that is actionable by the business (23:43)
Consolidation of tools and services (28:49)
Databricks vs. Snowflake (33:49)
Dave's focus on regulatory data at FanDuel (45:47)
May 19, 2021 • 43min

36: Crypto and Compliance with Nick Fogle, Co-Founder of Churnkey and Wavve

On this week's episode of The Data Stack Show, Eric and Kostas talk with Nick Fogle, co-founder of Churnkey and Wavve. Together they discuss how having a legal background can impact engineering decisions, dealing with privacy and compliance concerns, and selling Wavve and starting Churnkey as a result.

Highlights from this week's episode include:
Nick's background in economics and law and teaching himself to code (2:01)
Thinking like a lawyer and trying to minimize risk to the greatest extent possible (4:23)
GDPR and compliance (8:23)
Blockchain contracts (18:26)
Unique challenges surrounding compliance with a cryptocurrency startup (21:41)
Reconciling the right to be forgotten, GDPR, and blockchain permanence (27:16)
Building Churnkey after developing it as a way to lower churn among Wavve users (31:31)
How Churnkey's stack works (37:16)
Crypto predictions (39:02)
May 12, 2021 • 54min

35: The Future of Development is Distributed with Jim Walker of Cockroach Labs

This week on The Data Stack Show, Eric and Kostas talk with Jim Walker, the VP of product marketing at Cockroach Labs, about distributed systems, competing against the speed of light, and making data easy.

Highlights from this week's episode include:
Jim's background of translating deep technical concepts into understandable English and his work at Cockroach Labs (2:23)
The origin of Cockroach Labs and distributed SQL (6:10)
Living without Atomic Clocks (10:10)
Having the speed of light as the ultimate competitor (13:49)
CockroachDB’s users (19:35)
Figuring out big data for transactions (25:14)
Dealing with failure (35:04)
Open source code, community, and consumption (39:26)
Making data easy, and what's next for Cockroach (43:12)
Bringing programming into marketing (46:18)

Mentioned Links:
Spanner White Paper
Raft & Paxos
Michael Stonebraker