The Data Stack Show

RudderStack
Aug 18, 2021 • 55min

49: MLops - The Finalization of the Data Stack with Ben Rogojan of Facebook

Topics in this conversation include:
Ben's background and his shift to data engineering (2:19)
Trends in the data space: finding the most efficient tools, the Snowflake phenomenon, and keeping up with new functionalities (5:33)
Key differences in data practices in small companies and Facebook-sized companies (12:38)
Having to build tools specifically designed for Facebook because of SaaS product limitations (16:00)
Team structure at Facebook (18:17)
Developing more robust systems that are resistant to pipeline failure (19:50)
Defining data stacks (24:01)
A sample data stack for a young company (28:37)
Why Redshift and Snowflake have trended in the opposite direction (33:02)
BigQuery and Snowflake comparisons (36:06)
MLOps and whose responsibility it is (39:12)
Feast, Tecton, and feature stores (45:40)
Having a good community around an open source product (49:30)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Aug 11, 2021 • 33min

48: Season Two Recap with Eric Dodds and Kostas Pardalis

Highlights from this week’s episode:
Dissecting the different team structures from organizations in season two (1:16)
The people behind the data are key to the data itself (9:17)
Open source licensing and the core components needed for large scale commercial viability (15:13)
Game-changing core technologies in the new data economy (22:09)
The Snowflake vs. Databricks battle: "The UFC of Geeks" (25:54)
Aug 4, 2021 • 51min

47: Taming the Four Dragons of Data with Sven Balnojan of Mercateo Gruppe

Sven Balnojan, a PhD in Singularity Theory, shares his insights as a writer and product manager in the data realm. He dives into the dynamics between Databricks and Snowflake, highlighting their distinct approaches to structured and unstructured data. The conversation also tackles the challenges of innovating and making technology accessible, as well as the critical need for organizational change in response to emerging tech. Additionally, Sven discusses the complexities of successful open-source projects and the importance of iterative development.
Jul 28, 2021 • 56min

46: A New Paradigm in Stream Processing with Arjun Narayan of Materialize

Highlights from this week’s episode include:
Introducing Arjun and how he fell in love with databases (2:51)
Looking at what Materialize brings to the stack (5:28)
Analytics starts with a human in the loop and comes into its own when analysts get themselves out and automate it (15:46)
Using Materialize instead of the materialized view from another tool (18:44)
Comparing Postgres and Materialize and looking at what's under the hood of Materialize (23:16)
Making Materialize simple to use (32:33)
Why Materialize doubled down on writing 100% in Rust (35:43)
The best use case to start with (42:03)
Lessons learned from making Materialize a cloud offering (44:22)
Keeping databases to the cloud for low latency (48:31)
Jul 21, 2021 • 56min

45: Open Source and Attribution with Ophir Prusak of Codesmith

Highlights from today's conversation include:
Ophir's decision to switch from software engineering to marketing and riding the startup train (2:39)
Open sourcing in the world of software (5:55)
How open source has changed Ophir's life as a marketeer working at startups (10:28)
Chartio's sunsetting drove Ophir to search for a data tooling replacement (27:27)
Discussing trends in adoption of tools for small scale and large scale companies (35:01)
Data challenges related to attribution: how wrong do you want to be? (44:07)
Jul 14, 2021 • 58min

44: Leveraging Data in a Post-Covid World with Ruben Ugarte of Practico Analytics

Highlights from this week's episode:
Ruben's background (2:36)
Massive shifts in data caused by COVID (4:47)
Big Tech is no longer untouchable (9:54)
Accelerations in the BI space (15:17)
A focus on people and on trust (23:43)
Numbers are filtered by the biases of the people viewing them (28:46)
AI trends and adoption (38:06)
Using qualitative data for insights, particularly at early stages (40:56)
Recommendations for taking stock of who is using the data and assessing what their skills are (50:06)
Jul 7, 2021 • 46min

43: Modern Authentication and User Management with Sokratis Vidros of Clerk.dev

Highlights from this week's episode:
Sokratis' realization that big corporations were not the best thing for him (2:56)
Transitioning from Workable to Clerk.dev (3:40)
Convincing developers to outsource components to a service (9:36)
Clerk's layered solutions and how they affect the developer and the end-user (12:41)
Starting with Clerk from scratch vs. using Clerk to replace an existing component (19:55)
Synergies and SaaS starter kit (24:06)
Building Clerk to avoid a single point of failure (29:19)
Reflecting back on the transformation and growth of Workable, and how it was like working at eight different companies (33:03)
Lessons that Sokratis has taken away from his years as a developer (42:18)
Jun 30, 2021 • 53min

42: Scaling Data Science with Ryan Boyer of Shipt

Highlights from this week’s episode include:
Ryan's full circle path from stocking shelves at Target to using data science for a company owned by Target (2:00)
Building great tools and wielding them effectively (5:04)
Changes at Shipt since being acquired (9:29)
How people’s bias impacts models built by data scientists (12:30)
The different data sources Shipt incorporates (22:02)
How Ryan's work as a data scientist has changed as Shipt has grown (25:29)
How data science helps marketing (31:38)
Improving search experience (34:23)
Shipt's evolving data stack (38:27)
New trends in data science (47:06)
Jun 23, 2021 • 50min

41: Doing MLOps on Top of Apache Pulsar and Trino with Joshua Odmark of Pandio

Highlights from this week’s episode:
Joshua started his first company at age 15 and then sold two more startups after that (2:15)
Embracing the open source movement and not reinventing the wheel if you don't have to (12:15)
Pulsar seemed built to address Kafka's weaknesses (17:23)
Using Redis as a coordinator for federated learning and taking advantage of its portability (23:05)
The pillars of Pandio and some practical use cases (31:24)
Feature stores and model versioning (38:23)
Seeing Pulsar as the future because of the ability to run tens of millions of topics (41:04)
Jun 16, 2021 • 58min

40: Graph Processing on Snowflake for Customer Behavioral Analytics

Highlights from this week’s episode include:
Launching Affinio and the engineering backgrounds of the co-founders (2:36)
The massive transformation in customer data privacy regulation in the past eight years (6:23)
Creating the underpinning technology that can apply to any customer behavioral data set (10:05)
Ranking and scoring surfing patterns and sorting nodes and edges (14:13)
Placing the importance of attributes into a simple UI experience (19:28)
Going from a columnar database to a graph processing system (25:20)
Working with custom or atypical data (32:46)
The decision to work with Snowflake (37:43)
Next steps for utilizing third-party tools within Snowflake (52:18)
