
The Data Stack Show

Latest episodes

Feb 10, 2021 • 51min

24: Demystifying AI with Duc Haba

On this week’s episode of The Data Stack Show, Eric is joined by Duc Haba, an AI researcher and enterprise mobility solution architect consultant who most recently did AI consulting work with Cognizant. Their discussion revolves around demystifying artificial intelligence and why so many people either fear AI or place too much trust in it. Duc talks about some of the AI projects he has worked on, both successes and failures, and points to how the data biases that humans bring into the models can radically alter the outcome of those endeavors.

Highlights from this week’s episode include:
Duc's background with AI and getting to work with LeVar Burton (1:44)
Demystifying AI and coming up with a definition for it (3:34)
Misplaced fears of AI (7:53)
Misplaced trust in AI (10:36)
Public versus hidden AI (13:58)
Acquiring the data needed to train AI models (23:11)
Examples of interesting AI projects Duc has worked on (27:58)
Where to go to learn more about AI (35:06)
Thinking of AI as something that can help your business do something better with what it's already been doing (39:53)
Anticipating the near-future of AI (44:16)

The Data Stack Show is a weekly podcast powered by RudderStack. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Feb 3, 2021 • 43min

23: Migrating from On-Premises to the Cloud with Alex Lancaster from Intuit

On this week’s episode of The Data Stack Show, Kostas and Eric are joined by the risk data engineering manager at Intuit, Alex Lancaster. Alex has been with Intuit, known for products such as QuickBooks, TurboTax, and Mint, for 15 years and was part of a recent, massive, and successful re-architecture from on prem to the cloud.

Highlights from this week’s episode include:
Alex and his role at Intuit (1:51)
Data marts at Intuit (2:57)
Revolutionary changes in the data engineering space in the past 15 years (6:46)
Security in the cloud vs. on prem (12:46)
Data architecture at Intuit (15:42)
Doing ETLs inside or outside of the database (19:11)
How to transition successfully from on prem to cloud: forklifting vs. re-stacking (23:22)
Alex’s application of software engineering skills to data engineering (28:44)
Dealing with data engineering challenges related to security and regulation (31:48)
Pipelines managed and challenges in data types (36:45)
Jan 29, 2021 • 30min

22: Season One Recap with Eric Dodds and Kostas Pardalis

Season One of The Data Stack Show is in the books, and in this episode, Kostas and Eric take a look back at some of the biggest takeaways, trends, and topics from the season. With some great guests already set for season two, the next slate of episodes is shaping up to take an even deeper dive into the world of data and the people shaping it.

Key points in the conversation include:
Patterns with data warehouses and data lakes (3:38)
Looking back at the people behind the data and their stories (8:12)
Minimizing flaws while remembering that data is built by humans, for humans (11:02)
Using proven technology and making mature solutions (15:20)
Data involves a significant amount of trust (23:38)
Jan 20, 2021 • 46min

21: Data Integrity and Governance with Patrick Thompson and Ondrej Hrebicek from Iteratively

On this week’s episode of The Data Stack Show, Kostas and Eric are joined by the co-founders of Iteratively, CEO Patrick Thompson and CTO Ondrej Hrebicek. Iteratively helps companies know that their data can be trusted by helping capture clean, consistent product analytics. Today’s conversation goes behind the scenes of Iteratively and explores how trust in data can help accelerate the velocity of an organization.

Highlights from this week’s episode include:
Patrick and Ondrej’s background and the biggest problem Iteratively addresses (2:50)
Why some companies still use spreadsheet schema management and the potential pitfalls they’re setting themselves up for with this (4:39)
Defining schema in the context of data (7:02)
Viewing the process as a team sport (11:34)
Identifying common mistakes and implementing best practices (13:46)
A walkthrough of Iteratively (17:13)
Utilizing a JSON schema format (26:58)
Laying Iteratively on top of or integrating it with an implementation for analytics (30:36)
Entry point into organizations (33:02)
Organizational change and velocity realized after implementing Iteratively (36:04)
What’s next for Iteratively? (42:47)
Jan 13, 2021 • 53min

20: Transforming the Real Estate Market with Predictive Analytics with Arian Osman from Homesnap

This week on The Data Stack Show, Kostas and Eric are joined by Arian Osman, a senior data scientist at Homesnap who is also nearing the end of his PhD in computational sciences and informatics and is the developer of an e-commerce clothing brand. Homesnap is designed for both homebuyers and agents to access data from the MLS (Multiple Listing Service), providing real-time, accurate information to all parties involved.

Highlights from this week’s episode include:
Arian’s background and an overview of Homesnap (2:30)
Utilizing data in Arian’s e-commerce clothing brand (7:14)
Homesnap’s sell speed feature and visualizing outputs (13:28)
The psychology that drives upper and lower limits (19:33)
Deciding the life-cycle of a model (25:50)
Collaborating with internal stakeholders (30:47)
Unique challenges of data in the real estate domain (38:16)
Useful third-party tools (43:33)
Jan 6, 2021 • 53min

19: Defining Data Governance with Stephen Bailey from Immuta

This week on The Data Stack Show, Kostas and Eric are joined by Stephen Bailey, Director of Applied Data Science at Immuta. Immuta is a startup that focuses on enabling data teams to have really fast, efficient, and understandable access controls on their data.

Highlights from this week’s episode include:
The problem that Immuta solves (2:04)
Stephen’s background researching how the brain works (4:56)
Immuta’s stack (15:09)
Leveraging metadata (18:02)
The main use case for Immuta is simplifying the access control layer (20:06)
Unifying data (31:52)
Defining the quality of data (34:04)
Learning to trust the numbers (39:42)
What’s next for Immuta (46:15)
Dec 31, 2020 • 55min

18: Data Science in Health Insurance with Jason Haupt of Bind

This week on The Data Stack Show, Kostas and Eric are joined by Jason Haupt, data science lead at Bind, a no-deductible health insurance company determined to give immediate answers and clear costs before point of care. Jason’s unique background, which includes a Ph.D. in particle physics and work at the Large Hadron Collider at CERN, has informed the way he approaches data at Bind.

Highlights from this week’s episode include:
Jason’s background in particle physics and his path to Bind (2:53)
A cloud-only approach to data and utilizing AWS (9:01)
Focusing on activities that help its members (12:08)
Dealing with 12,000 columns of data from an insurance claim form (17:13)
Rethinking the relationship between marketing and product teams (25:28)
Examining the data pipeline (29:30)
Privacy and security concerns with medical information (35:45)
How experience with the LHC impacted the way he thinks about data (40:06)
Transition from academic work to industry (46:20)
Dec 9, 2020 • 57min

17: Working with Data at Netflix with Ioannis Papapanagiotou

This week on The Data Stack Show, Kostas and Eric are joined by Ioannis Papapanagiotou, senior engineering manager at Netflix. Ioannis oversees Netflix’s data storage platform and its data integration platform. Their conversation highlights the various responsibilities of his lean teams, their use of open source technology, and their adoption of change data capture solutions.

Key points in this week’s episode include:
Ioannis’ background with academia and Netflix (4:42)
Comparing the data storage and data integration teams (6:19)
Discussing indexing and encryption (20:31)
Netflix’s role in the open source community (27:21)
Implementing change data capture (40:42)
Using Bulldozer to efficiently move data in batches from data warehouse tables to key-value stores (42:43)
Dec 3, 2020 • 46min

16: Applying the Event Sourcing Pattern at Scale with Andrew Elster from Earnnest

On this week’s episode of The Data Stack Show, Kostas and Eric finish part two of a conversation about Earnnest, a digital platform originally designed for facilitating real estate transactions. In the previous episode, they talked with the CTO and co-founder Daniel Jeffords, and in this week’s episode, they talked with the other co-founder, Andrew Elster, CIO and chief architect. Andrew describes more about Earnnest’s stack and their decision to utilize Elixir and talks about their vision for scaling up their product.

Key topics in the conversation include:
Andrew’s journey from electrical engineering, to avoiding pirates in oceanic oil exploration, to starting Earnnest (2:57)
Keeping the platform flexible to expand beyond real estate transactions (10:24)
Being adaptable to support existing workflows (18:33)
The evolution of the database and implementing event sourcing (25:01)
Using a functional language like Elixir (30:54)
Developing Earnnest with scale in mind (37:33)
Nov 19, 2020 • 48min

15: Early Stage Analytics and Learning from the Y Combinator Experience with Axel Delafosse from Pool

This week on The Data Stack Show, Kostas and Eric are joined by Axel Delafosse, founder and CEO of Pool, a messaging app designed to help couples spend less time deciding what to do and spend more time together. Axel shares his story of how he went from having his idea shot down in person by Paul Graham to being accepted for Y Combinator. While Pool is still a young startup, Axel offers wise insight from lessons he’s learned along the way.

Highlights from this week’s episode include:
Pool Messenger, “the ultimate antidote to decision paralysis” (2:50)
Pitching to Paul Graham and applying to YC (6:17)
The importance of the co-founder relationship (14:01)
The YC experience and losing Facebook’s API (17:37)
Products die, relationships last (22:05)
Breaking down the data stack (28:50)
Using data and conversations with users to evaluate the experience (36:12)
