

Revisiting The Technical And Social Benefits Of The Data Mesh
01:10:53
Data Mesh Principles
- Data mesh decentralizes data ownership and sharing for analytics.
- It uses domain-oriented teams and treats data as a product.
Evolving Principles
- The core data mesh principles remain unchanged since their inception.
- A fourth principle, federated computational governance, was added later.
Modern Data Stack and Data Mesh
- Modern data stack tools improve data engineers' lives but lack application developer integration.
- Data mesh needs tools that bridge the gap between application development and analytics.
Introduction
00:00 • 2min
The Data Mesh - A Brief Introduction
02:06 • 2min
The Challenges of Complexity and Data Management
04:10 • 5min
The Four Principles of Scaled Solutions
08:53 • 4min
The Rise of the Modern Data Stack
12:35 • 4min
Data Analytics and Application Development - Is There a Need to Connect It?
16:49 • 2min
Embedded Analytical Solutions - The Foundational Need
19:03 • 1min
Database Innovation
20:17 • 5min
The Integration Between Data Modeling and the Source Is Already There
24:48 • 2min
Datafold - dataengineeringpodcast.com
26:59 • 2min
Application Development Frameworks - Data Mesh
29:22 • 2min
Data Product Creation
31:35 • 1min
Data Mesh - A New Approach to Data Mining
32:58 • 3min
Data Mesh
36:23 • 3min
Data Quantum - The Building Block of the Mesh
38:54 • 5min
The Hard Engineering Disciplines
44:05 • 2min
Using a Data Catalogue, Is It a Good Idea?
45:37 • 3min
Data Mesh
48:49 • 4min
The Evolution of the Data Mesh
53:08 • 2min
The Impact of Data Mesh in the Application Development Cycle
54:54 • 3min
Optimization of the User Experience
57:47 • 3min
What Are Some of the Most Unexpected or Challenging Lessons That You've Learned?
01:00:39 • 3min
When Is a Data Mesh the Wrong Approach?
01:03:13 • 3min
Data Mesh and the Community?
01:05:47 • 2min
Data Strategy and Execution - Part 4
01:08:10 • 2min
Summary
The data mesh is a thesis that was presented to address the technical and organizational challenges that businesses face in managing their analytical workflows at scale. Zhamak Dehghani introduced the concepts behind this architectural pattern in 2019, and since then it has been gaining popularity, with many companies adopting some version of it in their systems. In this episode Zhamak re-joins the show to discuss the real-world benefits that have been seen, the lessons that she has learned while working with her clients and the community, and her vision for the future of the data mesh.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription.
- Modern Data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days. Datafold helps Data teams gain visibility and confidence in the quality of their analytical data through data profiling, column-level lineage and intelligent anomaly detection. Datafold also helps automate regression testing of ETL code with its Data Diff feature that instantly shows how a change in ETL or BI code affects the produced data, both on a statistical level and down to individual rows and values. Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Go to dataengineeringpodcast.com/datafold today to start a 30-day trial of Datafold.
- Your host is Tobias Macey and today I’m welcoming back Zhamak Dehghani to talk about her work on the data mesh book and the lessons learned over the past 2 years
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by giving a brief recap of the principles of the data mesh and the story behind it?
- How has your view of the principles of the data mesh changed since our conversation in July of 2019?
- What are some of the ways that your work on the data mesh book influenced your thinking on the practical elements of implementing a data mesh?
- What do you view as the as-yet-unknown elements of the technical and social design constructs that are needed for a sustainable data mesh implementation?
- In the opening of your book you state that "Data Mesh is a new approach in sourcing, managing, and accessing data for analytical use cases at scale". As with everything, scale is subjective, but what are some of the heuristics that you rely on for determining when a data mesh is an appropriate solution?
- What are some of the ways that data mesh concepts manifest at the boundaries of organizations?
- While the idea of federated access to data product quanta reduces the amount of coordination necessary at the organizational level, it raises the spectre of more complex logic required for consumers of multiple quanta. How can data mesh implementations mitigate the impact of this problem?
- What are some of the technical components that you have found to be best suited to the implementation of data elements within a mesh?
- What are the technological components that are still missing for a mesh-native data platform?
- How should an organization that wishes to implement a mesh style architecture think about the roles and skills that they will need on staff?
- How can vendors factor into the solution?
- What is the role of application developers in a data mesh ecosystem and how do they need to change their thinking around the interfaces that they provide in their products?
- What are the most interesting, innovative, or unexpected ways that you have seen data mesh principles used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on data mesh implementations?
- When is a data mesh the wrong approach?
- What do you think the future of the data mesh will look like?
Contact Info
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
- Data Engineering Podcast Data Mesh Interview
- Data Mesh Book
- Thoughtworks
- Expert Systems
- OpenLineage
- Data Mesh Learning
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA