

Data Engineering Podcast
Tobias Macey
This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
Episodes

50 snips
May 19, 2024 • 54min
Zenlytic Is Building You A Better Coworker With AI Agents
Zenlytic is revolutionizing business intelligence systems by using AI agents that allow users to converse with their data. The podcast delves into the challenges and advancements in generative AI, highlighting the difference between AI chatbots and AI agents. The team discusses the importance of fundamental knowledge in AI models, navigating data lake complexity, and scalability considerations for B2B applications. They also explore the evolving role of AI agents in enhancing text data analysis and business intelligence.

4 snips
May 12, 2024 • 20min
Release Management For Data Platform Services And Logic
Explore the challenges of release management for data platform services and logic, including the complexities of testing data pipelines, strategies for data integrity testing, development environment challenges in Dagster pipelines, and the evolution of validation and release management in data systems.

36 snips
May 5, 2024 • 54min
Barking Up The Wrong GPTree: Building Better AI With A Cognitive Approach
Peter Voss, a pioneer in cognitive AI, discusses the shift towards human-like intelligence in AI, emphasizing learning over statistical prediction. The podcast explores the evolution from narrow AI to AGI, contrasts generative systems with cognitive AI, and highlights the challenges and benefits of achieving human-level AGI. Voss advocates for maximizing AI capabilities, leveraging open-source resources, and prioritizing transparency and explainability in AI models.

8 snips
Apr 28, 2024 • 50min
Build Your Second Brain One Piece At A Time
Tsavo Knott, creator of Pieces, discusses simplifying AI integration into developer workflows with a powerful collection of tools. He explains data collection, model types, and incorporating Pieces as a second brain. The podcast explores the impact of AI on developer tooling, personalized AI tools, challenges in machine learning, building integrated systems, and enhancing developer workflows with the Pieces tool.

4 snips
Apr 21, 2024 • 54min
Making Email Better With AI At Shortwave
Andrew Lee, Founder of Shortwave, discusses integrating AI into email to boost productivity. He shares technical challenges, benefits, and features of the product. Topics include challenges of email, optimizing AI models, embedding models, email synchronization, and transitioning to an email-focused product.

5 snips
Apr 14, 2024 • 1h 16min
Designing A Non-Relational Database Engine
Oren Eini, CEO of RavenDB, discusses designing a non-relational database engine, comparing it to relational engines. Topics include key design considerations, data modeling approaches, performance differences, and the importance of transactions. They also explore the influence of generative AI on the database market and vector search functionalities, emphasizing simplicity, operational ease, and distributed architecture considerations in database engine design.

9 snips
Apr 7, 2024 • 56min
Establish A Single Source Of Truth For Your Data Consumers With A Semantic Layer
Artyom Keydunov, creator of Cube, discusses the evolution and applications of the semantic layer in data platforms. Topics include challenges in defining metrics, implementing a semantic layer, transitioning from dbt to Cube, and the evolution of CubeJS to Rust. The episode also explores AI-driven data discovery tools for business consumers.

7 snips
Mar 31, 2024 • 51min
Adding Anomaly Detection And Observability To Your dbt Projects Is Elementary
This episode explores the importance of observability in dbt projects, with a focus on enhancing testing capabilities and anomaly detection. The conversation delves into the challenges data engineers face in building trust in data accuracy and the approach taken by Elementary to embed observability into the workflow.

23 snips
Mar 24, 2024 • 56min
Ship Smarter Not Harder With Declarative And Collaborative Data Orchestration On Dagster+
Discover how Dagster+ enhances data orchestration with declarative workflows, reducing the burden on data teams and enabling collaboration. Learn about the evolution of asset-oriented orchestration, data mesh concepts, and diverse industry use cases. Dive into the sustainable approach to the Dagster+ launch and considerations for choosing Dagster+ for data orchestration.

Mar 17, 2024 • 58min
Reconciling The Data In Your Databases With Datafold
The podcast delves into data reconciliation in databases, discussing error conditions and solutions to ensure data accuracy. Topics include challenges in data management, techniques for maintaining data quality, navigating reconciliation in warehouse migration projects, and strategies for cost management and data optimization. The episode also explores innovative uses of Datafold and the Data Diff utility in various sectors, the intersection of data engineering and AI applications, and advancements in tooling support for data engineers.