

The Data Stack Show
RudderStack
Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

Oct 25, 2023 • 1h 21min
161: The Intersection of Generative AI and Data Infrastructure with Chang She of LanceDB
Highlights from the podcast include the challenges in data collection, AI hype impact, LanceDB's file and table format, Vector Database introduction, importance of unstructured data, potential of generative AI, and changing expectations in information systems.

Oct 23, 2023 • 5min
The PRQL: How Did Pandas Become a Data Science Powerhouse? Featuring Chang She of Eto Labs
In this bonus episode, Eric and Kostas preview their upcoming conversation with Chang She of Eto Labs.

Oct 18, 2023 • 1h 6min
160: Closing the Gap Between Dev Teams and Data Teams with Santona Tuli of Upsolver
Highlights from this week’s conversation include:
Santona’s journey from nuclear physics to data science (4:59)
The appeal of startups and wearing multiple hats (8:12)
The challenge of pseudoscience in the news (10:24)
Approaching data with creativity and rigor (13:22)
Challenges and differences in data workflows (14:39)
Schema Evolution and Quality Problems (27:01)
Real-time Data Monitoring and Anomaly Detection (30:34)
The importance of data as a business differentiator (35:48)
The SQL job creation process (46:25)
Different options for creating solver jobs (47:20)
Adding column-level expectations (50:17)
Discussing the differences of working with data as a scientist and in a startup (1:00:18)
Final thoughts and takeaways (1:04:01)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Oct 16, 2023 • 6min
The PRQL: The Intersection of Physics, Data Science, and Product Development with Santona Tuli of Upsolver
In this bonus episode, Eric and Kostas preview their upcoming conversation with Santona Tuli of Upsolver.

Oct 11, 2023 • 1h 9min
159: What Is a Vector Database? Featuring Bob van Luijt of Weaviate
Highlights from this week’s conversation include:
How music impacted Bob’s data journey (3:16)
Music’s relationship with creativity and innovation (11:38)
The genesis of Weaviate and the idea of vector databases (14:09)
The joy of creation (19:02)
OLAP Databases (22:21)
The progression of complexity in databases (24:31)
Vector database (29:23)
Scaling suboptimal algorithms (34:34)
The future of vector space representation (35:51)
Databases’ role in different industries (39:14)
The brute force approach to discovery (45:57)
Retrieval augmented generation (51:26)
How a generative model interacts with the database (57:55)
Final thoughts and takeaways (1:03:20)

Oct 9, 2023 • 5min
The PRQL: Enhancing Search and Recommendation Systems with Vector Databases with Bob van Luijt of Weaviate
In this bonus episode, Eric and Kostas preview their upcoming conversation with Bob van Luijt of Weaviate.

Oct 4, 2023 • 1h 2min
158: The Orchestration Layer as the Data Platform Control Plane With Nick Schrock of Dagster Labs
Nick Schrock, Founder of Dagster Labs, discusses his background in data engineering and the founding of Dagster Labs. They cover topics such as the evolution of data engineering, fragmentation in data infrastructure, the role of orchestration in data platforms, lessons learned from working with GraphQL, different orchestrators in the data infrastructure landscape, the role of MLOps in data engineering, and the future of data teams and orchestration.

Oct 2, 2023 • 3min
The PRQL: The Power of Data Orchestration: A Game-Changer for Data Infrastructure, Featuring Nick Schrock of Dagster Labs
Nick Schrock, Co-founder of Dagster Labs, discusses the power of data orchestration and its impact on data infrastructure. He explores the history, current state, and future of orchestration, and shares insights from his experience at Facebook.

Sep 27, 2023 • 1h 4min
157: From Search Engine to Answer Engine Using Grounded Generative AI, Featuring Amr Awadallah of Vectara
Highlights from this week’s conversation include:
Amr’s extensive background in data (3:23)
The evolution of neural networks (9:21)
The role of supervised learning in AI (11:17)
Explaining Vectara (13:07)
Papers that laid the foundation for AI (15:02)
Contextualized translation and personalization (20:07)
Ease of use and answer-based search (25:01)
AI and potential liabilities (35:54)
Minimizing difficulties in large language models (36:43)
The process of extracting documents in multidimensional space (44:47)
Summarization process (46:33)
The danger of humans misusing technology (54:59)
Final thoughts and takeaways (57:12)

Sep 25, 2023 • 5min
The PRQL: How Can Large Language Models Revolutionize Decision-Making? Featuring Amr Awadallah of Vectara
In this bonus episode, Eric and Kostas preview their upcoming conversation with Amr Awadallah of Vectara.