The Data Stack Show

Latest episodes

May 24, 2023 • 58min

139: Decoupling the Execution Engine From Python’s Pandas with Aditya Parameswaran of Ponder

Highlights from this week’s conversation include:
Aditya’s background and journey in the data space (2:47)
What does Ponder do? (5:18)
101 on Pandas and why people utilize it (6:42)
The challenge of translating Pandas to a big data platform (16:11)
Data Warehouses and ML workflows (21:27)
The differences in the “zoo” of data languages (26:56)
Why do ML and data engineering have to be so different in languages? (34:39)
Builders should be adapting to the users and not the other way around (39:32)
Will we see a singular data interface in the future? (46:19)
Aditya’s most surprising discovery in his research (50:40)
Final thoughts and takeaways (53:18)

Read more of Aditya's work:
Pandas vs. SQL – Part 1: The Food Court and the Michelin-Style Restaurant
Pandas vs. SQL – Part 2: Pandas Is More Concise
Pandas vs. SQL – Part 3: Pandas Is More Flexible
Pandas vs. SQL – Part 4: Pandas Is More Convenient

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
May 22, 2023 • 4min

The PRQL: Removing the Execution Engine Language Barrier with Aditya Parameswaran of Ponder

In this bonus episode, Eric and Kostas preview their upcoming conversation with Aditya Parameswaran of Ponder.
May 17, 2023 • 1h 2min

138: Paradigm Shift: Batch to Data Streaming with A.J. Hunyady of InfinyOn

Highlights from this week’s conversation include:
A.J.’s background and journey in data (2:23)
Challenges with Hadoop ecosystem (8:50)
Starting InfinyOn and the need for innovation (10:02)
Challenges with Kafka and Microservices (14:01)
Real-time data streaming for IoT devices (19:28)
Paradigm shift to real-time data processing (22:17)
Benefits of Rust (29:45)
WebAssembly and Platform Features (36:29)
Analytics and Event Correlation (40:16)
Real-time data processing (47:03)
ETL vs ELP (52:20)
Final thoughts and takeaways (57:07)
May 15, 2023 • 5min

The PRQL: Data Infrastructure Systems and the Rust / WebAssembly Combo with A.J. Hunyady of InfinyOn

In this bonus episode, Eric and Kostas preview their upcoming conversation with A.J. Hunyady, Founder and CEO of InfinyOn.
May 10, 2023 • 59min

137: Data Collection Secrets & The Search Data Problem with Josh Wills

Highlights from this week’s conversation include:
Josh’s background in data working at Google, Slack, and other companies (1:21)
The need and process for high quality data (4:33)
Digging into auction code (14:03)
Joining Slack and working in the early days of the company (18:00)
Not fighting the last war in data (25:42)
Building a product, while using the product (30:35)
Transitioning to the search team at Slack (36:50)
Usage patterns of search (41:21)
Josh’s work in helping build DuckDB (46:20)
Having the right toolset to increase precision and efficiency (52:42)
Final thoughts and takeaways (56:03)
May 8, 2023 • 2min

The PRQL: Data Engineers in the Front End with Josh Wills

In this bonus episode, Eric previews his upcoming conversation with Josh Wills, an experienced data scientist who has worked with IBM, Google, Slack, DuckDB, and more.
May 3, 2023 • 1h

136: System Evolution from Hadoop to RocksDB with Dhruba Borthakur of Rockset

Highlights from this week’s conversation include:
Dhruba’s journey into the data space (2:02)
The impact of Hadoop on the industry (3:37)
Dhruba’s work in the early days of the Facebook team (7:54)
Building and implementing RocksDB (14:33)
Stories with Mark Zuckerberg at Facebook (24:25)
The next evolution in storage hardware (26:14)
How Rockset is different from other real-time platforms (33:13)
Going from a key value store to an index (37:15)
Where does Rockset go from here? (44:59)
The success of RocksDB as an open source project (49:11)
How do we properly steward real-time technology for impact (51:17)
Final thoughts and takeaways (56:18)
May 1, 2023 • 3min

The PRQL: Hardware Innovation Begets Software Innovation with Dhruba Borthakur, Co-Founder and CTO of Rockset

In this bonus episode, Eric and Kostas preview their upcoming conversation with Dhruba Borthakur of Rockset.
Apr 28, 2023 • 15min

Data Council Week (Ep 7) - What’s Next for Data Council? With Pete Soderling of Data Council

Highlights from this week’s conversation include:
The origin story of Data Council (0:39)
Developments for the future of Data Council (2:42)
The emphasis of AI and ChatGPT at this year’s conference (3:54)
The support of the data community (5:31)
Biggest changes and innovations in the industry (7:10)
What’s next for the Data Council? (10:46)
Getting connected with Data Council (13:07)
Apr 27, 2023 • 40min

Data Council Week (Ep 6) - All About Debezium and Change Data Capture With Gunnar Morling of Decodable

Gunnar Morling discusses how Debezium replicates data, working with Kafka, the importance of documentation in open-source projects, and the vision moving forward. The conversation also covers the challenges of open-source CDC solutions and the importance of building a diverse system with common interfaces.
