Data Mesh Radio

Interviews with data mesh practitioners, deep dives/how-tos, anti-patterns, panels, chats (not debates) with skeptics, "mesh musings", and so much more. Host Scott Hirleman (founder of the Data Mesh Learning Community) shares his learnings - and those of the broader data community - from over a year of deep diving into data mesh.

Each episode contains a BLUF - bottom line, up front - so you can quickly absorb a few key takeaways and also decide if an episode will be useful to you - nothing worse than listening for 20+ minutes before figuring out if a podcast episode is going to be interesting and/or incremental ;) Hoping to provide quality transcripts in the future - if you want to help, please reach out!

Data Mesh Radio is also looking for guests to share their experience with data mesh! Even if that experience is 'I am confused, let's chat about' some specific topic. Yes, that could be you! You can check out our guest and feedback FAQ, including how to submit your name to be a guest and how to submit feedback - including anonymously if you want - here: https://docs.google.com/document/d/1dDdb1mEhmcYqx3xYAvPuM1FZMuGiCszyY9x8X250KuQ/edit?usp=sharing

Data Mesh Radio is committed to diversity and inclusion. This includes our guests and guest hosts. If you are part of a minoritized group, please see this as an open invitation to be a guest and hit the link above.

If you are looking for additional useful information on data mesh, we recommend the community resources from Data Mesh Learning. All are vendor independent. https://datameshlearning.com/community/

You should also follow Zhamak Dehghani (founder of the data mesh concept); she posts a lot of great things on LinkedIn and has a wonderful data mesh book through O'Reilly. Plus, she's just a nice person: https://www.linkedin.com/in/zhamak-dehghani/detail/recent-activity/shares/

Data Mesh Radio is provided as a free community resource by DataStax. If you need a database that is easy to scale - read: serverless - but also easy to develop for - many APIs including gRPC, REST, JSON, GraphQL, etc., all of which are OSS under the Stargate project - check out DataStax's AstraDB service :) Built on Apache Cassandra, AstraDB is very performant and, oh yeah, is also multi-region/multi-cloud so you can focus on scaling your company, not your database. There's a free forever tier for poking around/home projects and you can also use code DAAP500 for a $500 free credit (apply under payment options): https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio

Latest episodes

Jan 6, 2022 • 56min

#12 Data-Centric Application Development and Data Mesh - Interview w/ Dan DeMers

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice!

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Dan's LinkedIn: https://www.linkedin.com/in/demersdan/
Cloud Information Model Standard: https://cloudinformationmodel.org/
Cinchy website: https://cinchy.com/

In this episode, Scott interviews Dan DeMers, Co-Founder and CEO of Cinchy, a dataware platform / data fabric provider. Dan shares his thoughts on why data-centric application design is the best way to deal with the challenges of applications and analytics needing the same data for different purposes - the current approach is to let the application schema evolve whenever and however necessary and the underlying data applications suffer. Dan's view is "share access to data, not copies."

The interview ties loosely with previous interviews re schema/data contracts and data testing as Dan argues using a dataware approach will prevent the issues of an evolving application schema breaking data consumption downstream.

It isn't all rosy as this will take a fair bit of work for an organization to move to this approach. Food for thought and the first of a series of interviews re DDD (domain-driven design) for data and data-centric application development.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jan 2, 2022 • 27min

#11 The Grinch Who Spoiled Christ-Mesh – 1) Why I Don’t Like Reverse ETL; 2) Dog Fooding; and 3) The Generalist Data Modeler – Mesh Musings 3

In this episode, Scott discusses three concepts that are at best a concern. Consider it a late Grinch-inspired present for Xmas :)

Reverse ETL meets a real need: analytical data being pushed into CRM, marketing, and other similar systems. But treating another pipeline as a first-order concern is fraught with the same issues as most data pipelines: who owns it, how does it evolve, who is observing/monitoring it for uptime and semantic drift, etc.? Should we look to create data products on the mesh to serve those needs instead of another ETL tool?

Some organizations implementing data mesh are forcing their domains to consume any analytics from their own data products on the mesh. The good of this is that it aligns the domain with creating high-quality data products. But will those data products be designed to fit the general organizational needs or specifically the domain's needs?

There is an emerging push for software engineers to also own the data modeling. To get to a place where this is even feasible, don't we need far better abstractions for domains to _do_ the data modeling? And will this overload software engineers who are already dealing with a metric buttload of technologies and requirements? Where would a junior engineer fit in that kind of organization? Does this mean more software engineers on the team -> 2 pizza teams now 3? 4? 5? 10? Maybe we pump the brakes on this for now?

Jan 1, 2022 • 1h 5min

#10 Ensuring Data Quality via Data Testing and Versioning – Interview w/ Jesse Paquette

Jesse's contact info:
Email: jesse at tag.bio
LinkedIn: https://www.linkedin.com/in/jessepaquette/
Twitter: @bzdyelnik / https://twitter.com/bzdyelnik
Website: https://tag.bio/
Tag.bio vendor interview for Data Mesh Learning: https://www.youtube.com/watch?v=acQADu7ttqQ

In this episode, Jesse Paquette, Chief Science Officer and Co-founder at Tag.bio - a data platform vendor in the life sciences space - and Scott dive a bit deeper into data quality in general, especially data testing and versioning.

You can see the LinkedIn post that sparked this discussion here

Jesse recommends a number of things to ensure data quality, especially data testing and versioning. This includes versioning of 1) the code used to create the data (generally the ETL code), 2) the schema, 3) the business logic layer, and 4) timestamping / temporality-based versioning.

Jesse's general calls to action are 1) make data testing frameworks so testing is much less tedious and time consuming; 2) work with stakeholders to gain trust in the data and then continue the dialogue to keep said trust; and 3) create schema/domain model blueprints so that domains have a starting point - whether they use it is irrelevant but shortening the path to a working domain model is crucial.

Dec 28, 2021 • 1h 6min

#9 Data Contracts Deep Dive – The Pains and the Solution(?) – Interview with Abhi Sivasailam

Abhi's contact info:
Twitter: https://twitter.com/_abhisivasailam
LinkedIn: https://www.linkedin.com/in/abhi-sivasailam/
Abhi's Data Mesh Learning Meetup: https://www.youtube.com/watch?v=-POiudR2_R0
Debezium blog post mentioned by Abhi: https://debezium.io/blog/2019/02/19/reliable-microservices-data-exchange-with-the-outbox-pattern/

In this episode, Abhi Sivasailam, Head of Growth and Analytics at unicorn startup Flexport, and Scott deep dive into data contracts. They covered a LOT of ground, including:

What is a data contract and how does it relate to API contracts and schema contracts
Why data contracts are so crucial to treating data like a product and keeping data usable by consumers
Abhi's rules for a viable data contract
The importance of the analytics engineer to overall data usefulness, especially re domain data ownership in data mesh
Why the socio of the socio-technical approach to data ownership is the most crucial aspect
The lack of proper tooling to monitor and execute data contracts
How to minimize disruptive changes to downstream data consumers
The issues with domains sharing their data as it is persisted/stored in the database instead of sharing the context via the domain model
Much, much more

Dec 25, 2021 • 1h 7min

#8 Platform Re-use, PoC Advice, and More Data Mesh Nuggets – Interview with Matthew Darwin

Matthew Darwin, a Principal Data Engineer at Slalom Consulting, shares his expertise on platform re-use and data ownership. He emphasizes the potential of leveraging existing technologies when building data platforms. Matthew discusses the journey of implementing data mesh, highlighting that perfection isn’t necessary from the outset. He shares insights on balancing culture and technology, as well as the value of collaboration in data management. His practical advice helps demystify the transition to decentralized data practices.

Dec 23, 2021 • 55min

#7 Data Schema Contracts – What are they and why they matter – Interview with Olivier Wulveryck, Senior Consultant @ OCTO

Olivier Wulveryck, a Senior Consultant at OCTO, dives deep into data schema contracts and their vital role in the data mesh framework. He explains how these contracts ensure data compatibility and enhance trust among users. Olivier emphasizes the importance of versioning and structured communication to maintain data integrity. The discussion also highlights the need for treating data as a product and fostering a culture that prioritizes data quality and governance. Listeners are encouraged to engage with the community for shared learning in this evolving field.

Dec 22, 2021 • 1h 2min

#6 All About Data Products – Interview with Wannes Rosiers, CTO Golazo Group

Wannes Rosiers, the CTO of Golazo Group and former leader of DPG Media's data mesh transition, shares his insights on data products. He discusses his framework for categorizing data products and emphasizes the need for a clear purpose in their creation. Wannes highlights the evolution of data products, advocating for ownership and adaptability in data strategies. He also dives into the concept of reverse ETL as a product and the importance of interoperability across domains. Lastly, Wannes encourages community engagement for growth in the data field.

Dec 22, 2021 • 40min

#5 Data Virtualization and Data Mesh, Spec Data Products, and Can You Keep Your Existing Architecture – Mesh Musings 2

Matthew Darwin, an insightful author with expertise in data platforms and architecture, joins to explore the intricacies of data mesh systems. They discuss the pitfalls of data virtualization for managing data products, emphasizing the need for genuine decentralization rather than superficial changes. The conversation also introduces speculative data products as a clever way to gather consumer feedback, underscoring the importance of consumer-focused documentation in enhancing data engagement and refining offerings.

Dec 13, 2021 • 7min

#4 Be A Guest! Feedback Welcome! A Call To Action

The hosts are eager to welcome new voices to their discussions on Data Mesh. They emphasize the importance of diverse perspectives in shaping conversations. Listeners are encouraged to provide feedback and suggestions for content that resonates with the community. The call to action invites anyone interested in sharing their insights to join the show. Engage and help create valuable resources for everyone involved!

Dec 13, 2021 • 55min

#3 Discover and Create Your Necessary Data Products - Data Product Flow Interview w/ Paolo Platter

In this engaging discussion, Paolo Platter, CTO at Agile Lab and a data product development expert, shares insights on migrating existing data products to a data mesh framework. He explores how businesses can identify necessary data products that align with organizational goals. The conversation delves into event storming for data solutions, the importance of tailored implementation plans, and measures for success in data product lifecycle management. Listeners will also discover complex pricing models for data products and how Agile Lab can assist in effective data mesh strategies.
