Data Engineering Podcast

Tobias Macey

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

Episodes

Mentioned books

Feb 18, 2024 • 59min

Pushing The Limits Of Scalability And User Experience For Data Processing With Jignesh Patel

Jignesh Patel discusses his research on technical scalability and user experience improvements in data management. The conversation covers the challenges of meeting data demand, the limitations of Moore's Law, efficient data retrieval and indexing, strategies for real-world context, and future problems in complex systems and data processing. Patel also highlights the importance of data discovery in data management technology.

Jan 1, 2024 • 48min

Designing Data Platforms For Fintech Companies

The CTO of fintech startup Monite discusses designing and implementing data platforms that handle the complexities of working with financial data. Topics include data challenges, regulatory requirements, managing backups, machine learning in fintech, reshaping accounting and customer support, and data governance challenges.

Dec 24, 2023 • 1h 15min

Troubleshooting Kafka In Production

Elad Eldor, author of 'Kafka: Troubleshooting in Production', discusses the challenges of operating Kafka at scale and ways to mitigate potential issues. Topics include the importance of Kafka in the data pipeline, doubling retention in Kafka, managed vs. self-managed Kafka clusters, data lake complexity, monitoring for Kafka, troubleshooting unreplicated partitions, the cost of running Kafka in the cloud, and the need for a correlation tool.

Dec 18, 2023 • 56min

Adding An Easy Mode For The Modern Data Stack With 5X

The podcast discusses the challenges of the modern data stack and how 5X is pre-integrating the best tools from each layer to solve these issues. It explores the need for a centralized control plane, strategic investments in data capabilities, and the benefits of consolidating to a single solution. The speaker also shares insights on platform building, simplifying user experience, and lessons learned in introducing a new product category.

Dec 11, 2023 • 51min

Run Your Own Anomaly Detection For Your Critical Business Metrics With Anomstack

The podcast discusses the Anomstack project and its goal of providing simple anomaly detection. Topics include the definition and prioritization of metrics, the implementation and optimization of Anomstack, and extending the project's capabilities. The episode also touches on data lakes, Starburst analytics, and challenges in data management.

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Sharing Across Business And Platform Boundaries

Tackling Real Time Streaming Data With SQL Using RisingWave