Data Engineering Podcast

Tobias Macey
24 snips
Aug 12, 2025 • 1h 11min

Bridging Data and Decision-Making: AI's Role in Modern Analytics

Lucas Thelosen and Drew Gilson, co-founders of Gravity, delve into the transformative impact of AI in data analytics. They discuss their creation of Orion, an autonomous data analyst designed to bridge data and decision-making. The conversation highlights how AI democratizes access to data insights for businesses of all sizes, allowing data analysts to focus on strategic tasks. They also emphasize the importance of accuracy and trustworthiness in AI-driven workflows, sharing insights on how companies can cultivate a data-driven culture.
44 snips
Aug 5, 2025 • 50min

From Bits to Tables: The Evolution of S3 Storage

In this discussion, Andy Warfield, a storage expert at Amazon, dives into the evolution of S3 storage. He explores the capabilities of S3 Tables and Vectors, which target modern data management and analytics. Andy shares insights on how customer feedback has shaped these developments, improving performance for AI workloads. He also discusses applications of these features in industries like genomics and finance, along with the technical challenges of integrating advanced data types.
37 snips
Jul 28, 2025 • 52min

Revolutionizing Python Notebooks with Marimo

In this conversation, Akshay Agrawal, Co-founder and CEO of Marimo, introduces an open-source Python notebook environment. He tackles the drawbacks of traditional Jupyter notebooks, such as hidden state, and showcases Marimo’s reactive execution model and improved interactivity. Akshay also discusses the tool's ability to integrate data apps, in contrast to platforms like Jupyter and Streamlit. The talk covers the technical architecture, community-driven development, and future plans, including AI enhancements, aimed at improving data workflows.
20 snips
Jul 21, 2025 • 55min

Warehouse Native Incremental Data Processing With Dynamic Tables And Delayed View Semantics

Dan Sotolongo, a principal engineer at Snowflake, shares insights on simplifying data engineering through incremental data processing and delayed view semantics. He dives into the complexities of managing evolving datasets in cloud warehouses, discussing how these concepts optimize resource use and reduce latency. The conversation contrasts traditional batch systems with dynamic tables and streaming solutions, emphasizing the need for a unified framework for semantic guarantees in data pipelines, and highlights the ongoing innovations in data integration and maintenance.
23 snips
Jul 15, 2025 • 52min

Streamlining Data Pipelines with MCP Servers and Vector Engines

Kacper Łukawski, a Senior Developer Advocate at Qdrant, specializes in vector databases for large language models. He dives into transforming unstructured data into valuable insights through semantic search and retrieval-augmented generation. Kacper explains the integration of MCP servers for optimizing data pipelines and discusses the challenges of managing embeddings. He also highlights innovative applications in coding practices and the complexities of vector search, offering practical advice on fine-tuning models and reducing costs for enhanced search quality.
107 snips
Jul 6, 2025 • 55min

Foundational Data Engineering At Two Sigma

Effie Baram, a leader in foundational data engineering at Two Sigma, shares her insights into the pivotal role of data in finance. She delves into the complexities of maintaining data quality while ensuring quick delivery, navigating the socio-technical challenges of building a robust data platform. Effie discusses treating data as code, leveraging modern data warehouses, and the evolution of data engineering roles. She emphasizes the transition from fixed schemas to flexible structures, underscoring the importance of collaboration and regulatory compliance in a rapidly changing landscape.
63 snips
Jun 29, 2025 • 54min

Enabling Agents In The Enterprise With A Platform Approach

Arun Joseph, an AI engineering leader and entrepreneur, discusses his journey in developing multi-agent systems. He emphasizes the transformative potential of agentic capabilities in businesses and shares insights on building robust data models and orchestration loops. Arun tackles the challenges of managing large-scale data contexts, the importance of unified context management to avoid silos, and the shift toward open-source platforms like LMOS. He also explores how these innovations can enhance decision-making and streamline enterprise data management.
33 snips
Jun 18, 2025 • 1h 2min

Dagster's New Era: Modularizing Data Transformation in the Age of AI

Nick Schrock, CTO and founder of Dagster Labs, shares his expertise on the transformational role of AI in data engineering. He discusses the importance of maintaining core data principles amid AI advancements and emphasizes the necessity of human oversight. Nick introduces Dagster's innovative components feature that modularizes data transformations, enhancing collaboration. The conversation also covers the balance between simplifying tools for non-technical users and maintaining customizability, addressing key challenges in the evolving data landscape.
13 snips
Jun 11, 2025 • 44min

AI and the Lakehouse: How Starburst is Pioneering New Workflows

Alex Albu, tech lead for AI initiatives at Starburst, dives into the fascinating world of integrating AI workloads with lakehouse architecture. He shares his journey from software engineering to championing AI enhancements at Starburst. The discussion covers innovative solutions like AI agents for data exploration and metadata enrichment. Alex addresses the hurdles of marrying AI with traditional data systems and reveals future visions for improved data formats and AI-driven tools, promising a revolution in data management.
63 snips
Jun 3, 2025 • 1h 1min

Amazon S3: The Backbone of Modern Data Systems

Mai-Lan Tomsen Bukovec, Vice President of Technology at AWS, traces the evolution of Amazon S3, a key data repository since 2006. She shares insights on how S3 grew from basic storage into the backbone of modern data architecture, enabling scalable data lakes. The discussion includes the importance of metadata, the integration of S3 with Iceberg, and innovations like strong consistency. Companies like Adobe and Netflix leverage S3 for efficiency, showcasing its role in handling both structured and unstructured data.
