
Data Engineering Podcast
This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
Latest episodes

Nov 27, 2023 • 30min
Addressing The Challenges Of Component Integration In Data Platform Architectures
The host discusses the challenges of integrating components in data platform architectures, including user experience, data sharing and delivery, and shadow IT. The episode explores event-driven pipelines, access control, data flow ownership, and metadata propagation. It emphasizes the importance of reliable integrations and extensible systems, along with tools like OpenLineage and dbt. Python and open metadata platforms are highlighted for simplifying integration and managing permissions and roles across data tools.

Nov 20, 2023 • 1h 16min
Unlocking Your dbt Projects With Practical Advice For Practitioners
Practical advice for building and scaling dbt projects, covering dbt adoption and use, the complexities of data lakes, challenges with YAML in dbt projects, and the importance of planning and structure.

Nov 13, 2023 • 1h 8min
Enhancing The Abilities Of Software Engineers With Generative AI At Tabnine
Eran Yahav, founder of Tabnine, discusses the journey of building an AI assistant for software engineers. Topics include advancements in AI code completion, the usage and effectiveness of Tabnine, challenges of customizing generative AI for software engineering, and future directions for Tabnine.

Nov 6, 2023 • 55min
Shining Some Light In The Black Box Of PostgreSQL Performance
Lukas Fittl, a database performance expert, discusses performance bottlenecks in PostgreSQL, tools such as PostgreSQL's EXPLAIN, common optimization challenges, and the importance of tuning configuration parameters. He also shares insights on the development of pganalyze, enabling performance settings in PostgreSQL, and the evolution of database engines.

Oct 30, 2023 • 47min
Surveying The Market Of Database Products
Tanya Bragin, an experienced product manager for major vendors, shares insights on how to approach tool selection in the database market. Topics include open-source technologies, trends in the database market, challenges in data projects, the importance of data observability tools, and future trends.

Oct 23, 2023 • 1h 4min
Defining A Strategy For Your Data Products
Ranjith Raghunath shares his thoughts on building a strategy for data products, including centralizing vs decentralizing data product strategy, managing technical debt, and the importance of metrics in data product strategy.

Oct 15, 2023 • 1h 8min
Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable
This podcast episode discusses the challenges of building and maintaining stream processing infrastructure. It explores the evolution of streaming systems and the role of platforms like Decodable in simplifying stream processing. The speakers also delve into the challenges of stream processing applications, different ways to interact with Decodable, the importance of 'glue' in stream processing, and the biggest gap in data management tooling and technology.

Oct 9, 2023 • 52min
Using Data To Illuminate The Intentionally Opaque Insurance Industry
Max Cho, founder of a business that aims to make insurance policy selection more navigable, discusses the challenges of navigating the opaque insurance industry. Topics include data collection and analysis, automating a manual industry, insurance pricing transparency, challenges of AI navigation, data preprocessing for analysis, understanding policy complexities, and the utility of large language models in the insurance industry.

Oct 1, 2023 • 52min
Building ETL Pipelines With Generative AI
This episode covers AI's impact on and role in ETL pipelines, using generative AI for unstructured data, experimenting with AI models, the evolving role of AI assistants in data engineering, considerations and challenges of using AI in ETL pipelines, and the changing landscape of ETL tools.

Sep 25, 2023 • 59min
Powering Vector Search With Real Time And Incremental Vector Indexes
This episode discusses the growth of machine learning and the need for vector search capabilities. It explores the challenges of real-time indexes, the benefits of semantic search, and incorporating vector search into data flows. It also covers the considerations and limitations of vector search and shares insights on working with vector databases.