
Data Mesh Radio
Interviews with data mesh practitioners, deep dives/how-tos, anti-patterns, panels, chats (not debates) with skeptics, "mesh musings", and so much more. Host Scott Hirleman (founder of the Data Mesh Learning Community) shares his learnings - and those of the broader data community - from over a year of deep diving into data mesh.
Each episode contains a BLUF - bottom line, up front - so you can quickly absorb a few key takeaways and also decide if an episode will be useful to you - nothing worse than listening for 20+ minutes before figuring out if a podcast episode is going to be interesting and/or incremental ;) Hoping to provide quality transcripts in the future - if you want to help, please reach out!
Data Mesh Radio is also looking for guests to share their experience with data mesh! Even if that experience is 'I am confused, let's chat about' some specific topic. Yes, that could be you! You can check out our guest and feedback FAQ, including how to submit your name to be a guest and how to submit feedback - including anonymously if you want - here: https://docs.google.com/document/d/1dDdb1mEhmcYqx3xYAvPuM1FZMuGiCszyY9x8X250KuQ/edit?usp=sharing
Data Mesh Radio is committed to diversity and inclusion, including in our guests and guest hosts. If you are part of a minoritized group, please see this as an open invitation to be a guest - just hit the link above.
If you are looking for additional useful information on data mesh, we recommend the community resources from Data Mesh Learning. All are vendor independent. https://datameshlearning.com/community/
You should also follow Zhamak Dehghani (founder of the data mesh concept); she posts a lot of great things on LinkedIn and has a wonderful data mesh book through O'Reilly. Plus, she's just a nice person: https://www.linkedin.com/in/zhamak-dehghani/detail/recent-activity/shares/
Data Mesh Radio is provided as a free community resource by DataStax. If you need a database that is easy to scale - read: serverless - but also easy to develop for - many APIs including gRPC, REST, JSON, GraphQL, etc. all of which are OSS under the Stargate project - check out DataStax's AstraDB service :) Built on Apache Cassandra, AstraDB is very performant and oh yeah, is also multi-region/multi-cloud so you can focus on scaling your company, not your database. There's a free forever tier for poking around/home projects and you can also use code DAAP500 for a $500 free credit (apply under payment options): https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio
Latest episodes

Aug 1, 2022 • 1h 21min
#107 Focusing on Outcomes and Building Brave Teams in Data - Interview w/ Gretchen Moran
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.Gretchen's LinkedIn: https://www.linkedin.com/in/gretchenmoran/NGS' current openings: https://ngs.wd1.myworkdayjobs.com/ngs_external_career_siteIn this episode, Scott interviewed Gretchen Moran, the Senior Director, Data Products at the National Geographic Society (NGS; the non-profit arm of National Geographic).Some key takeaways/thoughts from Gretchen's point of view:NGS is a bit unique in that they don't have a widely deployed data architecture so they do not have a lot of habits to unlearn. Starting with a greenfield means likely more training and learning/experimenting will be required but at least no institutional unlearning.To move forward with data mesh, organizations must be able to embrace change - and the pain that it will inevitably bring - and embrace ambiguity. You need to move forward and figure it out together but also be okay with failure as a learning experience as you test what works for your organization.To win the hearts and minds of data producers, show them what high-quality data can mean for the organization and their domain/role. Work closely with them, understand their context, hold their hand to bring them along and align them to the vision of data mesh.It's easier to drive buy-in widely if you find the organizational influencers and win them over. It is the domino effect in practice. Partner closely with the influencers early on to drive your initiative forward.For NGS, they are working with a single initial data producing team for their proof of value. The data mesh world seems to be split a bit between working with one or two to three teams in the initial proof of value stage."Any technology effort is still a people effort."We have yet to learn how to leverage the knowledge and context of people without data knowledge in general in the data and analytics space. This is what data mesh tries to unlock but we are still figuring out how to do it well.It's very easy to intimidate people with data. We need to make tech and especially data much less intimidating to push broader adoption. The business context of those who aren't yet data literate can be extremely valuable. We need to lower the actual bar to leveraging data but also lower the perceived bar to leveraging data."Metrics + outcomes = value" - without outcomes attached, metrics have no value.Automation is going to be key to many aspects of data mesh. Upskilling people to leverage data will only really pay off if it doesn't mean a large increase in the amount of work to leverage data.User experience is crucial to getting the most value out of your data. Think about your data user experience (DUX) and bring in designers to help optimize the experience and really focus on data as a product thinking.NGS is still trying to find who should own generating and sharing insights on data combined from multiple domains. Is that a centralized insights team? 
Does that push us too far back towards centralization? It's still early days but those insights are crucial to driving value from data.We will see where new insights come from in data mesh. Will it be more insights from data consumers as they can spend time on data analysis instead of data cleaning? Or will it be the data producers as they learn how to really leverage their own data? Both?Many are waiting for vendors to validate data mesh but they really haven't done that very well yet. It remains to be seen if they even could validate something so complex and large in scope.Building brave teams - teams that aren't afraid of new challenges or of failure or especially of ambiguity - will be crucial to getting data mesh right. Teams might be afraid of doing things "wrong" but as long as they work to get to a good state and incorporate feedback and learnings, that's what will drive much more value in the long-run.There is really shared ownership of data even inside the same domain. The subject matter experts and the people shaping the data products must build a strong relationship with good communication.Data mesh aims to - and needs to - solve for the cost of change in data. Traditional data warehouses have had an extremely high cost of change. The huge cascading pipeline setups most have with data lake too. Data mesh needs to make evolution in data much easier, quicker, safer, less costly, etc. It's still early days there.There is a delicate balance between over architecting and underinvesting in your platform. Look to build for reuse and don't lock yourself into decisions where possible. Far easier said than done.Many teams are worrying if they are doing data sharing wrong. But can they actually really do it wrong? Yes, probably, but if they are open to feedback and paying attention, they don't have to get it "right" the first time. You can evolve to get to a very good place but prior data setups have been so rigid, that evolution has been tough and costly.You probably don't need to build out as large of a team as you might think to start on your data mesh journey - depending on your timeline. It will take years to get to delivering fully on your vision but you can add a lot of value as you progress and learn - you don't have to get it perfect at the start! And consider what skillsets are really crucial.Gretchen started off by giving her background and some of the ways her history has played into her perspective and current role. A big factor in her interest in data mesh was helping a number of large organizations evolve their data platforms and how that helped those organizations deliver better results - but they still often struggled somewhat to derive full value from their data and data mesh can hopefully unlock that value.For the National Geographic Society relative to data mesh, preparation has been Gretchen and team's keyword. Rather than trying to move forward with their data mesh implementation as fast as possible, they've spent the last year testing and preparing for implementing their data strategy. And they are in a bit of a unique situation because even though the organization is over 100 years old, really their tech stack is about seven years old and they don't have a cohesive data architecture deployed. This means they also don't have a lot to unlearn but have a ton to learn and experiment on.NGS is already organized in a product-centric approach and the technologists really understand their domains. 
Now, Gretchen and team just need to get them bought in that they should treat their data like they do their applications - like a product - as they move forward with their data mesh implementation. Easier said than done, but the organization in general hasn't been pushing back on these ideas, which has meant good initial collaboration.

While embracing sharing data is crucial to NGS' overall organization-wide strategy, it's not all sunshine and rainbows, as Gretchen knows there will be a lot of heavy lifting: heavy lifting around change, heavy lifting in going against the status quo. To be successful with a data mesh implementation, the organization is going to have to embrace change and ambiguity. And both are typically painful.

Gretchen and team knew that to lay the groundwork for something like data mesh, their organization needed a base layer of data literacy. Without an understanding of data, would they even have people to consume data, much less people capable and willing to produce their data like a product? So they started by bringing in consultants to help people start learning and to build a general business glossary.

But to really reach the "hearts and minds" of the general organization, Gretchen knew they needed to show people what value data can bring them. What are their goals and needs, and how can data support that? How is something like having high-quality data available valuable to data consumers and - more tricky - how is it also valuable to data producers? And part of their data literacy/upskilling process was showing people what using data could mean for them, not just a training course in SQL or Tableau. Just training people how to use a piece of technology in a vacuum hasn't worked well. A success vector for Gretchen has been finding the organizational "influencers" that provide the leverage to drive buy-in across the org.

So, how are Gretchen and team getting going after their preparation period? They are partnering very closely with an initial pilot team and are going to prove out the value to share more broadly across the organization. This has been an interesting question in data mesh - how many domains and/or data products should be involved in your proof of value?

Broadly speaking, Gretchen believes - and Scott agreed - any technology effort is still very much a people effort. It is very hard to do something like make data self-evident, so the people need to steer and steward any such technology effort forward. Then add in the fact that data mesh is much more organizational/process focused and the people side becomes even more crucial.

Gretchen talked about metrics in general and her theory of the "bad metrics sin" - that it is worse to have bad metrics than no metrics at all. And to identify early - and then stay away from - vanity metrics. She strongly believes that metrics + outcomes = value, so without the outcomes, metrics don't have value. As Sadie Martin and Katie Bauer mentioned in their episodes, measure what matters and measure what you will act on. And measuring impact in the NGO (non-governmental organization; a term for many non-profits) space is particularly difficult - Gretchen used the word persnickety - so really finding your useful metrics and backing them up can be a challenge but is crucial.

One behavioral change Gretchen is pushing heavily as people learn more and more about data: asking about people's rationales when making a choice working with data. The outcome is more context for all involved - people use their context to make choices, so learning why they made a choice can highlight very interesting points. Why did they go with X versus Y? It's crucial to do this to enhance curiosity and learning rather than asking people to prove their reasoning/understanding. So ask with the tone and goal of "tell me more".

And it's easy and quite common to intimidate people with data, per Gretchen. We need to lower the actual bar to leveraging data but even more so, we need to lower the perceived bar of how challenging it is to leverage data. Part of doing that is meeting people where they are, showing them how they can leverage their current knowledge and skills while upskilling them to be even more effective.

Gretchen is seeing people in NGS jumping through so many hoops to produce reports and data in very manual ways, so enabling them to produce and consume data automatically and more reliably is something she's excited to take on. That way, NGS can leverage their knowledge and skills without the manual effort - allowing them to focus on the value-add aspect of working with data: the insights and how to act on them.

User experience (UX) is really crucial to everything NGS does, per Gretchen. Their product managers spend a lot of time really understanding the business aspect of things, not just the software pieces. So they now need to learn how to do the same with data. Product thinking is crucial to getting data mesh right, not just creating data products. How can we move to sharing actual insights instead of just data? And especially, who owns creating and sharing insights on data combined from multiple domains?

For Gretchen, it will also be interesting to see what additional insights can be generated when we focus on keeping data clean from the start, rather than having data consumers clean it up after the fact. What additional insights might come from people actively monitoring the collection and processing of information? Who will generate the new insights? Will it be the traditional data consumers, who can now spend the time to work with the data instead of cleaning it? Or will more insights flow from the data producers as they really get their arms around their own data? The answer is probably both.

Building brave teams - teams that aren't afraid of new challenges or of failure or especially of ambiguity - will be crucial to getting data mesh right in Gretchen's view. People have to welcome change and understand that while change is painful, there is a point and purpose for it. Give them the understanding of what the change is for and what the reasoning is.

Gretchen and team are trying to ensure they aren't over architecting the data platform - putting in too much work too early and locking themselves into choices if there isn't a need. But then it is quite easy to underinvest and not provide what people actually need. So she's really focused on making the platform robust enough but not too rigid or expensive. It's a hard needle to thread.

Many teams in NGS are worrying if they are doing data sharing wrong, per Gretchen. But can they actually really do it wrong? Yeah, probably, but if they are open to feedback and paying attention, they don't have to get it "right" the first time to get it right eventually. You can evolve to get to a very good place - prior data setups have been so rigid, and change so extremely painful, that evolution has been tough.
Data mesh needs to solve for lowering the cost and fear of change in data but it's still early days.Gretchen doesn't think you need to build out a huge team to do data mesh, or at least to get moving. Her team's approach is to build a reusable base for generating and managing mesh data products and have a few data architects to keep moving things in the right direction. Then, they have the team and the drive to teach developers how to manage data as a product and get them bought in that it's necessary to do so.Some rapid-fire insights from Gretchen to wrap up:We have yet to learn how to leverage the knowledge and context of people without data knowledge in general in the data and analytics space. This is what data mesh tries to unlock but we are still figuring it out.There are good incentives for teams to produce high quality and reliable data but you have to work with them closely to explain it.The concept of data lake was to only invest in cleaning and maintaining the data when there was a clear use-case, a clear reason to invest that time. But it was clean-up, not proactive cleaning, and typically had opaque and/or mediocre ownership - that made it much harder to derive the value.Vendors have yet to really validate data mesh and that means many folks are still sitting on the sidelines. It will be interesting to see if vendors really can ever validate it given how complex and large in scope data mesh really is.To do data mesh right, many stakeholders need to parse the principles - or at least what the principles are trying to achieve - and then, crucially, adapt them to your culture. Data mesh can't be about cutting and pasting from someone else's implementation.Shared ownership of data is very hard. That seems obvious but even within the domain, there is shared ownership between the subject matter experts and those shaping the data to share externally. There needs to be strong communication and a good relationship between those parties.Really spend the time to consider what skillsets you actually need. And when you will need them. It's okay to have more basic data products in the early days of a mesh implementation as developers learn how to work with data properly.Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereAll music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 31, 2022 • 29min
Weekly Episode Summaries and Programming Notes - Week of July 31, 2022 - Data Mesh Radio
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 29, 2022 • 1h 8min
#106 Building an Effective Data Strategy: Why oh Why Don't You Start with the Why - Interview w/ Liz Henderson
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center hereLiz's LinkedIn: https://www.linkedin.com/in/lizhendersondata/Liz's website: https://lizhendersondata.wordpress.com/In this episode, Scott interviewed Liz Henderson AKA The Data Queen, Executive Advisor at Capgemini. To be clear, Liz was only representing her own views on the podcast.Some high-level takeaways/thoughts from Liz's view:To drive buy-in and engagement in a data strategy - especially with people outside the IT/data team - focus on the "why". Why are you doing this initiative or approach? What business goals is it supporting?Also on driving buy-in for a data initiative, start by listening instead of selling/pitching. Focus on the business needs and work backwards to show how data can help address those needs.To be successful with a large change-management data initiative, you need the patience, leadership, courage, and will to push forward. And the budget - don't forget the budget :DYou can't have an effective data strategy if it isn't directly tied into the business strategy. Really consider how data can help to support and execute on the business strategy. Data strategy in a vacuum away from the business is a recipe for trouble.It's very easy to get overly focused in data on what you are delivering instead of why you are delivering it and who it is supposed to serve. If you want to be successful, you need to focus on the latter two. And look to deliver continuous incremental value rather than a back-end loaded value delivery.Change management is very easy to get wrong in data. Really consider if you can not only get the ball rolling, but keep it rolling and in the right direction to implement a large change. Loss of momentum can mean loss of funding.If data mesh follows a similar pattern to data literacy, it's likely to be 3-4 years from initial large swell in hype around data mesh - whether that was late 2021 or more now - until we really see a clear picture of how more organizations have implemented. There needs to be a time for trial and error.Data literacy can only get you so far - you need data storytelling and visualization as the business people need to be able to understand what the data is saying to drive decisions.The 3 most common ways organizations go wrong with data are 1) technology-first / expecting to buy your way to a solution to challenges; 2) not asking the "why" questions; and 3) not having a data strategy.Liz started the conversation talking about how data is an asset to the company - not treating data as an asset, an important differentiation - and started with a common theme for the conversation: "why?". Why is data important to the business? Why are we doing the data work we are doing? What is the purpose?When speaking with company executives, especially those outside the IT/data team, Liz starts every conversation by digging into what is the business trying to do. What is the business strategy? 
How does data currently play into that strategy? What role do they want data to play in the business strategy? What role might data play if everything were perfect? You can't have an effective data strategy without it tying directly to the business strategy.When considering any aspect of a data strategy, whether that is data mesh or something else, Liz again recommends starting from the why. In some conversations, IT leaders have said "we want to do data mesh" but when Liz dug deeper, they couldn't answer why. Huge red flag. So when considering a data initiative, think about what is the target impact of the work. How would executing on the data strategy impact the business, not the IT environment? Always ask "so what?". Why is this the right path forward for the business and how is data pushing the business results forward?For Liz, if you are speaking with executives outside of the IT/data team about a data initiative like data mesh, to drive buy-in, don't start by selling, start by listening. What are the challenges they are having and work backwards to address those challenges through data. What are the business needs and wants, and how can data help them to address them? That should heavily inform your data strategy.From experience, Liz knows it is quite easy to deliver a very large, costly data initiative that no one really uses or benefits from, AKA a "white elephant". So people play probably an even bigger role in data initiative and data strategy success than most would assume. The cultural aspect is crucial to doing something like data mesh well - if the data consumers still only want to use spreadsheets, it doesn't matter if you are delivering the best data products in the world, there won't be the demand to make the work worth it.Liz gave a specific example of a recent conversation with a company wanting to do data mesh when talking with IT leaders. When she started to dig in, the business wasn't involved at all with the decision to do data mesh. And a big part of doing data mesh is, you know, the business teams owning the data. It's the first data mesh principle! So if you run across a situation like this, you should ask why they want to implement without the business' involvement. Will that be change for the sake of change instead of driving business results?Patience, leadership, courage, and will are all necessary to effectively execute a data change management initiative or overarching data strategy in Liz's view. And don't forget the budget - both for getting going and for the continued improvement and maintenance. It's often easy to get data change started but maintaining the momentum, especially in the right direction, can be quite difficult. And if momentum starts to falter, budget can go away quickly in many orgs. Really consider if you are ready for a large-scale change before moving forward. As Zhamak has said many times, "Think big. Start small. Move fast."Liz shared her insights into how it often takes 3-4 years for a new, large-scale approach to how people work in data to go from everyone talking about it until we see how people are actually implementing it. Data culture and data literacy were all the rage 3-4 years ago but there wasn't much info about how to actually implement a data literacy strategy - we are just starting to see adoption stories being shared. It might take that long for data mesh.So what happened with data literacy where it is now relatively widespread? 
For Liz, a lot of it was the general data and analytics industry maturing with a strong general awareness of the concept and need for data literacy rather than any one point or push. With data literacy, employees can understand how data impacts their role and thus how it impacts the business. So that is possibly how data mesh might evolve - broad awareness and the brave bleeding edge folks helping to mature the concept and find the useful patterns and anti-patterns but it still takes quite a while.Circling back on how crucial the people aspect is for data mesh - or any data initiative - Liz is aligned with the research that people are the silent success or failure point for most data initiatives rather than technology or architecture. Technology-led initiatives in data are quite likely to fail.Liz recommends that companies looking to be more "data driven" or "data informed" really show to employees how data impacts the overall business and encourage them to consider "the art of the possible" relative to data. And consider ways to take feedback and data requests to enhance the business but not in a single data run for one person type of way, but new data products to share that information with many more people. That way, people know that their insights might actually mean something and might result in a new service or opportunity. And it encourages people to speak up more about what might be interesting additional information to have, leading to better data products.Data storytelling and strong visualization are an often overlooked part of doing data right per Liz. Having that data translator is crucial to take the information and make it so 1) people can understand what the data is telling us - the insight - and 2) what that might mean - the potential action. Just sharing information without the understanding is overwhelming and confusing.Liz wrapped up her thoughts with 3 key points on where people go wrong in data: 1) using a technology-first approach - stop picking solutions to try to be a silver bullet, really consider what you are trying to do first; 2) not asking the why, especially about "why do we want to be data driven"; and 3) not having a data strategy at all - you need a compass to move forward.Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereAll music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 26, 2022 • 9min
#105 Data Modeling in Data Mesh Part 1 - Mesh Musings 23
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 25, 2022 • 1h 4min
#104 How Does Data Mesh Impact the Business: Learnings from T-Mobile Polska's Early Journey - Interview w/ Karolina Henzel
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.Karolina's LinkedIn: https://www.linkedin.com/in/karolina-henzel/In this episode, Scott interviewed Karolina Henzel, Data Enablement Tribe Lead at T-Mobile Polska. FYI, businesses and domains are fundamentally similar in this conversation and are used essentially interchangeably.Some high-level takeaways/thoughts/summarizations from Karolina's view:Business transformation and impact is what really matters. Digital transformation is just a mechanism to transform your business into being more digital native and focused. Data transformation is just a part of digital transformation. Transformation should all come down to driving positive business impact.To drive something like data mesh forward, you really need top management support, likely a C-level executive sponsor. Otherwise, it is very easy for work to get deprioritized and pushed out.Don't take on a large-scale data initiative unless there are specific business challenges to address. Don't do data mesh for the sake of "being data driven"; what are the issues and why will addressing them help your business? Explicitly define the problems and the pain points.To drive change, look for "change agents" in the domains. They are people with the will and capabilities to drive large-scale change. They aren't always easy to find but once you start to identify them, patterns will emerge.The big pain points T-Mobile Polska was facing were: 1) poor/inconsistent data quality; 2) data discovery difficulties; and 3) slow time-to-market for new data and insights.T-Mobile Polska was able to move forward with data mesh because business representatives in the domains were bought in that addressing the data pain points would drive incremental business value - there would be a return on the data work investment.Look for quick wins and how to deliver continuous incremental value. If it is all about producing a big bang, you will very likely lose momentum, prioritization, and funding. Continuous value delivery is crucial to keeping people excited about data work.T-Mobile Polska's data quality issues were caused mostly by a lack of accountability/ownership and not adhering to standard definitions across domains and reports.Lack of standard definitions - or at least very clearly differentiated definitions - can cause numbers to not match across reports. And that makes people not trust the data. So drive domains to clearly define terms and, if possible, look to create standard definitions across the organization.Find KPIs that are focused on what actually impacts the business. Dave Colls mentioned fitness functions to break down measuring progress against big challenges into smaller measurements. Karolina and team are making sure the end result is business impact, not technical-only change.You can drive buy-in that data producers should provide high quality data products by showing producers the impact their efforts are having. 
Can be a chicken and egg issue at first but you can take the results from one domain or business and show them to another to drive buy-in.T-Mobile Polska has a data owner in each domain that has high-level data ownership accountability. There is a data steward in each domain as well that is the subject matter expert. When needed, there is also a technical data steward (embedded data engineer) but it's not necessary in many domains.As a leader in the data governance team at T-Mobile Polska, Karolina has a considerable number of different aspects under her including the data platform, the data warehouse, data quality, etc. Much like data governance as a term has many aspects.Karolina started off the conversation with an important and useful point she emphasized a few times: you need to have specific challenges you want to solve before you embark on something like a data mesh journey. Not just "let's be data driven" - what business challenges are you trying to solve? And look at them from a "why" perspective - why does tackling this challenge matter? Those challenges could be poor data quality, time spent on non-value-added tasks, data discovery issues, etc. Define the problems and the pain points.It's important to understand that digital transformation is about business transformation first and foremost. What is your business trying to achieve? And what are the target/expected business outcomes? More revenue? Cost savings? Etc. You need to define the pain points and what doing something like data mesh will do for the organization to secure cooperation from business leaders. And don't count on patience as you work towards a big value delivery in the future - you need to continuously create "incremental value" along the way.For Karolina, there were three main challenges they needed to address relative to data with their data mesh implementation: 1) data quality was a constant issue - typically stemming from lack of real ownership/accountability and no standard term definitions; 2) data discovery - it was very difficult for data consumers to find data; and 3) time-to-market for new data and insights - the data function was becoming a major bottleneck to the business side.Data governance can't only be an IT problem/challenge, per Karolina. In their data mesh implementation, they are focusing the central governance team on creating the tools and frameworks for the distributed teams to leverage. For instance, the business and technical metadata comes from the domains but the data catalog is offered as part of the platform by the governance team. This separation of duties has allowed quick time to business benefits when bringing on new teams to their data mesh implementation.Karolina and team knew they were facing issues with data so they started interviewing business representatives to ask what were their biggest challenges. The governance team heard repeatedly data quality was an issue but didn't know exactly why they were having data quality issues. So they moved to increase accountability, assigning data owners and data stewards. Collaborating with the owners and stewards, they were able to figure out a few major causes were: a lack of real ownership, no common definitions, no real standard measurement of quality, etc. And addressing those challenges resulted in some quick wins to get positive momentum towards delivering continuous incremental value.At T-Mobile Polska, Karolina has seen how crucial having a C-level sponsor is to succeeding with something like a data mesh implementation. 
It is very easy to lose prioritization - there is always a more pressing short-term business need than producing high quality data so you need someone that can make sure that data work isn't unreasonably pushed out. Specifically, they created a data governance committee to have strategic supervision of the data governance and data quality efforts and identify the strategic initiatives to continuously deliver incremental value and put things on businesses roadmaps.Scott asked a question he asks many people: what is the reason for creating new mesh data products at T-Mobile Polska? Karolina shared that data products are initially created to serve reporting specifically in most cases. They can expand to serve additional use cases but there is a specific use case in mind for each new mesh data product.Karolina discussed some of the new ways of working and the challenges around the necessary mindset shifts to implement something like data mesh. People were just used to data engineering delivering the data. So producers were used to throwing things over the wall and data consumers were used to making asks to a highly data literate group of people. So, they are inventing new ways of working and processes to not have data engineering handling the communication between teams. Business owners are in charge of explaining why owning and serving their data as a product can add value to their org, what is in it for each person in their own org. One explanation that has resonated well - and been proved out repeatedly - is that by moving to a data mesh way of working, there is a significant reduction in time-to-market for new data and insights including for the producing domain.As part of their data mesh implementation, Karolina and team have been restructuring KPIs to make it possible to measure the impact of the data work they are doing. Their focus is on the impact to the business, not technical focused KPIs. One big goal - with a few proof points thus far - has been a reduction in data work that doesn't add value - reducing the time your data science team spends on things that aren't valuable means they can put more value-add models into production. Another big goal, as previously mentioned, is reducing the time-to-market for new data and insights as many other data mesh implementers are seeing. And Karolina's team is driving buy-in through results by showing data producers how much impact they are having or could have by providing quality data.As for how T-Mobile Polska started their journey, Karolina and team started with laying the foundation for good data governance. They first found the data owners and the data stewards in each business. Then they explained the new responsibilities for those roles and why they were necessary. The data owner is at the Director level, essentially the business or domain owner, and the data steward is more of a subject matter expert. And if there are complicated data needs, that domain needs a technical data steward - an embedded data engineer - as well; but not many domains need a technical data steward. Another thing specifically mentioned was leveraging "change agents", the people with the will and the capabilities to drive large-scale change.Karolina then shared some of the issues they've had with data democratization. Similar to what Ust Oldfield mentioned in his episode, just giving access to data when people don't really understand how to leverage data can do more harm than good. 
So T-Mobile Polska is pushing the not as data literate people to the data catalog as their only point of interface with data on the mesh; the governance team is focused on enabling producers to create standardized reports and datasets to serve those people. The more technical folks have more options to interface with data with fewer technical guardrails.In wrapping up, Karolina reiterated a few of her main points. 1) Focus everyone on what you are trying to accomplish - what are the priorities? What is the impact to the business? 2) Look to deliver incremental value continuously to build and maintain momentum in your data mesh implementation - without that incremental value, support for your implementation is likely to falter. And 3) C-Level management support is crucial to really drive an initiative like data mesh - without it, your work is likely to get deprioritized and will be continuously pushed out.Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereAll music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
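One way to picture the ownership and shared-definition ideas from this episode: below is a rough, hypothetical sketch in Python of a descriptor a domain team might publish alongside a data product, naming the data owner, the data steward, and the agreed term definitions so that reports stop disagreeing about what a metric means. The structure, field names, and example values are assumptions for illustration only, not T-Mobile Polska's actual setup.

```python
# Hypothetical sketch: a machine-readable descriptor a domain team might publish
# alongside a data product, capturing the ownership roles and shared term
# definitions discussed in the episode. Field names are illustrative only.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TermDefinition:
    name: str           # e.g. a business term like "active_subscriber"
    definition: str     # the agreed, organization-wide wording
    owning_domain: str  # which domain is the authority for this term


@dataclass
class DataProductDescriptor:
    name: str
    domain: str
    data_owner: str                          # director-level accountability for the data
    data_steward: str                        # subject matter expert in the domain
    technical_steward: Optional[str] = None  # embedded data engineer, only where needed
    initial_use_case: str = ""               # the specific use case the product was built for
    terms: list = field(default_factory=list)

    def undefined_terms(self, columns):
        """Return columns that would ship without an agreed definition - a common
        source of numbers not matching across reports."""
        defined = {t.name for t in self.terms}
        return [c for c in columns if c not in defined]


# Illustrative usage: flag fields that would be published without a shared definition.
descriptor = DataProductDescriptor(
    name="postpaid_churn_monthly",
    domain="consumer_mobile",
    data_owner="director.consumer@example.com",
    data_steward="sme.consumer@example.com",
    initial_use_case="monthly churn reporting",
    terms=[TermDefinition(
        name="active_subscriber",
        definition="SIM with billable usage in the calendar month",
        owning_domain="consumer_mobile",
    )],
)
print(descriptor.undefined_terms(["active_subscriber", "churn_flag"]))  # -> ['churn_flag']
```

A check like undefined_terms could plausibly live in the central governance team's tooling rather than being run by hand in each domain, in line with the "tools and frameworks from the center, content from the domains" split described above.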

Jul 24, 2022 • 27min
Weekly Episode Summaries and Programming Notes - Week of July 24, 2022 - Data Mesh Radio
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 22, 2022 • 1h 13min
#103 4 Years of Learnings on Decentralized Data: ABN AMRO's Data Mesh Journey - Interview w/ Mahmoud Yassin
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center hereMahmoud's LinkedIn: https://www.linkedin.com/in/mahmoudyassin007/In this episode, Scott interviewed Mahmoud Yassin, Lead Data Architect at ABN AMRO, a large bank headquartered in the Netherlands.Some high-level takeaways/thoughts from Mahmoud's view:It's very difficult to do fully decentralized MDM, which led to some duplication of effort - that can mean increased cost and people not using the best data. ABN tackled this through their Data Integration Access Layer - similar to a service bus.They are using that centralized layer - called DIAL - to help teams manage integrations that are both consistently running and on-the-fly. It helps monitor for duplication of work instead of reuse.If Mahmoud could do it again, he'd focus on enabling easy data integration earlier in their journey to encourage more data consumption. Cross domain and cross data product consumption is highly valuable.The industry needs to develop more and better standards to enable easy data integration.Data mesh and similar decentralized data approaches cannot fully decentralize everything. Look for places to centralize offerings in a platform or platform-like approach that can be leveraged by decentralized teams.Most current data technology licensing models aren't well designed for or suited to doing decentralized data - it's easy to pay a lot if you aren't careful - or even if you are careful!A tough but necessary mentality shift is not thinking about being "done" once data is delivered. That's data projects, not data as a product.Try to keep as much work as possible within the domain boundary when doing data work. Of course, cross-domain communication is key but try to limit the actual work dependencies on other domains if possible.A data marketplace enables organizations to more easily create a standardized experience across data products and make data discovery much easier. You don't necessarily have to tie your cost allocation models to the marketplace concept.Sharing what analytical queries/data integration "recipes" people are using has been important for ABN. It drives insights across boundaries and also creates a lower bar to interesting tangential insight creation/development.You should consider not allowing integrations across multiple data products by default. Producers should be able to stop integrations - for compliance purposes or because the integration doesn't actually provide good/valuable/correct insights.Traditional ETL development is about translating the business needs to code. But centralized IT usually can't deeply understand the business context and needs so they deliver substandard solutions. If you consider that business needs evolve, it gets even worse.Mahmoud started his career in data as an ETL developer so he saw the ever increasing issues with the traditional enterprise data warehouse approach in large organizations. 
Then he moved on to working with the common way people have approached data lakes - managed by a centralized team - and to him the issues seemed pretty similar to those with data warehouses. So he was glad to start working with ABN AMRO on decentralizing their approach to data starting about 4 years ago.

ETL development is about translating the business needs to code. But Mahmoud saw the same problem many organizations are having - it is very hard for IT to really understand the real business context and needs relative to data. They try, but it often is only on the second or third attempt - if at all - that IT really understands and gets it right. They simply cannot get enough context to serve needs well.

So to kick off the discussion, Mahmoud made it clear: there is no perfect data architecture. Not one that fits all organizations, and not even one that will fit your organization over time as it evolves or across all of its needs - so look to what fits your organization at the moment. And it's okay to take pieces from multiple approaches and try them to see if they fit as a cohesive strategy. But do make sure not to just pick the easiest or most fun parts from multiple strategies - cohesion is crucial.

For Mahmoud, a key mentality shift in doing decentralized data, especially data mesh, has been around what "done" looks like. When you think about a physical goods-type product, it's not done once it goes to production. With IT-run data, it was typically that project mentality and "done" was when you delivered the data and moved on. It is crucial to learn how to do actual product management, not just software product management, to understand how to do data as a product right.

A key learning from Mahmoud and team, which echoes something Jesse Anderson mentioned, is trying to keep as much of the work done around data inside the domain. There is obviously a need for cross-domain communication and collaboration, but there is a big cost to crossing domain boundaries when doing data work.

While some people think we should decentralize everything we can in data - Scott calls those people simply "wrong" - Mahmoud and team found there to be a significant cost to decentralizing the wrong things. They have a centralized governance layer to make things easier on data product producers and consumers. And trying to do fully decentralized MDM (master data management) can quickly lead to duplication of effort and data. Omar Khawaja mentioned similar issues early in Roche's journey.

So, how did ABN tackle these data duplication challenges? Per Mahmoud, they created DIAL, their Data Integration Access Layer. This is similar to a service bus on the operational plane, with the DIAL layer handling data quality checking, business metadata, technical metadata, checking against an interoperable data format, etc. It is another instance of a centrally managed service that is leveraged by decentralized teams.

Similar to a number of other organizations, Mahmoud discussed how ABN AMRO is creating an internal data marketplace as the mechanism for centralized data discovery and consumption. This way, there is a standard user experience when looking for and trying to understand what data is available. A standardized experience is crucial to really drive data consumption. The marketplace requirements also lead to a very transparent way to share data.

Per Mahmoud, ABN is also working on making the data integration experience standardized in a few different ways.
The previously mentioned DIAL layer is a centralized way to do integrations, whether that is creating new data sets that are reused across multiple downstream data products or integrating in more of a virtualized, on-the-fly way. If you aren't careful, it is pretty easy - especially if there are domains that might naturally touch on similar concepts - to duplicate work, which can cost A LOT. Especially because most data tool licensing isn't designed for doing decentralized data.As part of or similar to the marketplace concept, Mahmoud talked about how ABN is creating integration recipes. So while recipes may not be a data product, these repeated integrations may be similar to a downstream data product in how they present to data consumers. And other consumers can leverage the same recipe or clone it and adapt it to their needs. It has been very important to share what recipes others are using to drive insight sharing across domains.To help manage compliance/governance and also to make sure data consumers understand what they are actually consuming, the DIAL layer prevents people from doing data integration without consent from data producers. Ust Oldfield mentioned something similar regarding how self-serve without understanding by data consumers can cause major issues.Mahmoud and Scott discussed how different just creating data products and data as a product thinking are. If you are really thinking of your data as a product, versioning and the actual data product interface are crucial. And with versioning, it's important to know who will be impacted by a change when assessing if and how a change should happen.One thing Mahmoud would do differently is focusing more on encouraging/enabling data consumption earlier in the journey. While consumption is picking up, it still is below desired levels and is behind how mature they are with getting data on to the platform to share. Part of the reason for lower than desired consumption came from leaving the focus on data integration until later in their journey. They are trying to find - or if not find, then develop - better standards to make data integration easier. While there are some standards for metadata like OpenMetadata, it's still early days.Lastly, Mahmoud mentioned how their metadata was just getting to be in too many places so they are building out a metadata lake - a tool-agnostic lake for their metadata. It remains to be seen if this is a common pattern in data mesh but it may address one of Scott's big concerns - the "trapped metadata" problem.Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereAll music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
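Mahmoud describes DIAL only at a high level, but to make the idea of a centrally managed integration layer more tangible, here is a minimal, hypothetical sketch in Python of the kinds of gate checks such a layer might run before allowing a cross-product integration: producer consent, required metadata, and a basic freshness check. All names, fields, and rules here are assumptions for illustration; this is not ABN AMRO's implementation.

```python
# Hypothetical sketch of the kind of gate a centralized integration layer (like
# the DIAL concept described above) might apply before a consumer is allowed to
# integrate data products: producer consent, required metadata, and freshness.
from datetime import datetime, timedelta, timezone

REQUIRED_METADATA = {"owner", "description", "schema_version", "last_updated"}


def producer_has_consented(product: dict, consumer: str) -> bool:
    # Producers can explicitly block integrations (e.g., for compliance reasons).
    return consumer in product.get("approved_consumers", [])


def metadata_is_complete(product: dict) -> bool:
    return REQUIRED_METADATA.issubset(product.get("metadata", {}).keys())


def is_fresh(product: dict, max_age: timedelta = timedelta(days=1)) -> bool:
    # last_updated is assumed to be an ISO-8601 timestamp with timezone info.
    last_updated = datetime.fromisoformat(product["metadata"]["last_updated"])
    return datetime.now(timezone.utc) - last_updated <= max_age


def can_integrate(products: list, consumer: str):
    """Return whether the integration may proceed, plus the reasons it cannot."""
    problems = []
    for p in products:
        if not producer_has_consented(p, consumer):
            problems.append(f"{p['name']}: producer has not approved {consumer}")
        if not metadata_is_complete(p):
            problems.append(f"{p['name']}: missing required metadata")
        elif not is_fresh(p):
            problems.append(f"{p['name']}: data is stale")
    return (len(problems) == 0, problems)


# Illustrative usage with a single, made-up data product.
product = {
    "name": "payments_transactions",
    "approved_consumers": ["fraud_analytics"],
    "metadata": {
        "owner": "payments-team@example.com",
        "description": "Cleared card transactions",
        "schema_version": "2.1",
        "last_updated": "2022-07-20T06:00:00+00:00",
    },
}
ok, reasons = can_integrate([product], consumer="marketing_analytics")
print(ok, reasons)  # False, with the blocking reasons listed
```

The point of centralizing a check like this is that every integration gets the same governance treatment without each consuming team re-implementing it - the same "centrally managed service leveraged by decentralized teams" pattern the episode describes.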

Jul 18, 2022 • 1h 12min
#102 Share Data by Default and Other Stories/Advice from Leboncoin's Data Mesh Journey So Far - Interview w/ Stéphanie Bergamo and Simon Maurin
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center hereStéphanie BergamoLinkedIn: https://www.linkedin.com/in/st%C3%A9phanie-baltus/Twitter: @steph_baltus / https://twitter.com/steph_baltusSimon MaurinLinkedIn: https://www.linkedin.com/in/simon-maurin-369471b8/Twitter: @MaurinSimon / https://twitter.com/MaurinSimonIn this episode, Scott interviewed Stéphanie Bergamo and Simon Maurin of Leboncoin. Stéphanie is a Lead Data Engineer and Simon is a Lead Architect at Leboncoin. From here on, S&S will refer to Stéphanie and Simon.Some key takeaways/thoughts from Stéphanie and Simon's point of view:"Bet on curious people", "just have people talk to each other", and "lower the cognitive costs of using the tooling" - if you can do that, you'll raise your chance of success with your data mesh implementation.Leboncoin requires teams to share information on the enterprise service bus that might not be directly useful to the originating domain on the operational plane. They are using a similar approach with data for data mesh - sharing information that might not be useful directly to the originating domain by default.Leboncoin presses teams to get data requests to other teams early so they can prioritize it. There isn't an expectation of producing new data very quickly after a new request, which is probably a healthy approach to data work/collaboration.Embedding a data engineer into a domain doesn't make everything easy, it's not magic. Software engineers will still need a lot of training and help to really understand data engineering practices. Tooling and frameworks can only go so far. Be prepared for friction.Similarly, getting data engineers to realize that data engineering is just software engineering but for data - and to actually treat it as such - might be even harder.Software engineers generally don't know how to write good tests relative to data. Neither do data engineers. But testing is possibly more important in data than in software. We all need to get better at data testing.Start with building the self-service platform to solve the challenges of the data producers first. You may make it very easy to discover and consume data but if the producers aren't producing any data...If your software engineers are doing data pipelines at all before starting to work with them in a data mesh implementation, you can probably expect they aren't using best practices.It's pretty common for good/best practices to be known by only a few people inside an organization, such as with a specialty-focused guild. Look for ways to cross-pollinate information so more people are at least aware of best practices if not able to fully implement them yet.Trying to force people to share data in a data mesh fashion didn't work for Leboncoin and probably won't in most organizations. 
Find curious developers and help them accomplish something with data - that will drive buy-in.
Related to the previous point, data products often start as something serving the producing domain and then evolve to serve additional use cases. They start by serving a specific business need and evolve from there.
Look to build your tooling to enforce your data governance requirements/needs. Trying to put too much on the plate of software engineers probably won't go well.
Around the time Zhamak's first post on data mesh came out in mid 2019, Leboncoin was experiencing many of the pain points Zhamak laid out quite clearly in her article. Their teams were already organized in the "Spotify model", so data ownership was already distributed to many of the domains. But they were seeing increasing time-to-market - often hitting what Simon called "very long" - for new data initiatives. They already had an organizational model and some ways of working that fit well with data mesh, so they decided to give it a try.
So, per S&S, they tried using the data mesh principles for a first use case - building out their recommendation engine. It was a greenfield initiative, so it was a good one to test out how well data mesh could work for incremental data needs.
In order to proceed with the pilot, S&S and the rest of the data team had to negotiate with the CTO. Once the pilot was successful, they started embedding data engineers into the teams with the most obvious needs while starting to build out the self-service platform. They already had their CI/CD platform for the operational side, so they adapted it to also work with data products. Then they added the additional data processing requirements, the governance, etc. to make it as self-service as possible for data producing teams.
The good news, per S&S, was immediate traction for the self-serve platform with the back-end engineers. But they were still suffering from the distance between the data and software engineering people/capabilities. It was difficult to get the software engineers to see data engineering as a type of software engineering, and many of the data engineers also had a hard time seeing data engineering as a subset of software engineering.
This is a common complaint from many organizations - just because you embed data engineers into domains, that doesn't mean everything becomes easy. You still need to get the software engineers/developers to understand and care about data and data engineering practices, and the data engineers need to learn more about software engineering to best collaborate with the software engineers.
Data pipelines were a major blind spot for a number of the software engineers, according to S&S. If the software engineers were doing pipelines at all, most were not doing them that well - with a number of not-so-great practices, to put it nicely. So there was a focus on communicating why data pipelines are so crucial to the overall company and how software engineers can learn to do them better. Data mesh can help to facilitate sharing that vision, and giving the software engineers ownership over data got them excited in many cases.
S&S are reevaluating whether their current internal guild setup really works with a data mesh approach. It is currently organized only by specialty, which means there isn't a lot of cross-pollination of information - people outside a specific guild don't have easy access to the new best practices shared with members of that guild. Tim Tischler mentioned the idea of broad group show-and-tells / info sessions around data products; done around data practices, that may help with these challenges.
This lack of broader best practice sharing is biting S&S and Leboncoin in the behind, especially around testing. While software engineers know how to write really good software tests, most data engineers aren't as good at writing tests, and software engineers aren't good in general at writing data-specific tests. But testing is really crucial to being confident in future changes - if you don't know what will happen with a change, that's a bad spot to be in.
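The conversation doesn't spell out what good data-specific tests look like, so here is a minimal, hypothetical example of the kind of test being pointed at - assertions about the data itself (nulls, uniqueness, business-rule ranges) rather than about code paths. The dataset, column names, and thresholds are invented; tools like dbt tests or Great Expectations formalize the same idea.

# Minimal, hypothetical data-specific tests (pytest style). The dataset,
# column names, and thresholds are invented purely for illustration.
import pandas as pd

def load_ad_events() -> pd.DataFrame:
    # Stand-in for reading a data product's output (warehouse table, files, ...).
    return pd.read_parquet("ad_events.parquet")

def test_no_null_primary_keys():
    df = load_ad_events()
    assert df["event_id"].notna().all(), "event_id must never be null"

def test_event_ids_are_unique():
    df = load_ad_events()
    assert not df["event_id"].duplicated().any(), "duplicate event_id found"

def test_prices_within_expected_range():
    df = load_ad_events()
    # Example business rule: listing prices are positive and below a sanity cap.
    assert ((df["price_eur"] > 0) & (df["price_eur"] < 1_000_000)).all()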
On driving buy-in, S&S shared that trying to force people along the path to sharing their data just didn't work well. What they found worked was finding the curious developers and helping them accomplish what they wanted with data - and finding actual projects that can add value, ones with specific use cases, often ones that are directly useful to that domain itself first.
At Leboncoin, many of their data products start off serving the producing domain, and then the domain lets others know they've created potentially useful data. This is similar to what Leboncoin does on the microservices side, with teams often consuming their own events from the enterprise service bus. So the first step for a data product is to build to explicit business needs and then see if additional business value comes from the data product.
Per S&S, another thing that has been helpful is their roadmap process - teams should tell other teams what they will need from them early. If you have a need for data, you need to communicate it early so other teams can prioritize it. There isn't an expectation of immediately producing data, which is a healthy way to collaborate.
Leboncoin has an interesting approach to sharing information. On the operational plane, as mentioned earlier, they have an enterprise service bus and teams are expected to share information that might not be explicitly useful to themselves - they are asked to consider what might be useful for other teams and to share that from the start of the development process, so there is no need to request it be added later. They are taking the same approach on the data side with data mesh. It might not be in data products with strong SLAs, but other domains can at least understand what data could be formed into data products.
S&S recommend that when you start building out your federated governance, really start by following the pain. Put data engineers and back-end engineers in the same room to find out what actually needs to be done and what should be built into the platform. If you can make the tooling enforce governance requirements/needs, that's easier for pretty much all parties.
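The episode doesn't describe Leboncoin's actual tooling, but "make the tooling enforce governance" often takes the shape of an automated check that runs before a data product can be published (for example in CI). A hypothetical sketch - the required fields, allowed values, and rules below are all invented for illustration:

# Hypothetical sketch of governance enforced by tooling: an automated check
# that runs (e.g. in CI) before a data product may be published. The required
# fields, allowed values, and rules are invented for illustration.
REQUIRED_FIELDS = {"name", "owner_team", "data_classification", "retention_days"}
ALLOWED_CLASSIFICATIONS = {"public", "internal", "restricted"}

def validate_data_product(descriptor: dict) -> list[str]:
    # Returns a list of violations; an empty list means the product may ship.
    errors = []
    missing = REQUIRED_FIELDS - descriptor.keys()
    if missing:
        errors.append(f"missing required fields: {sorted(missing)}")
    if descriptor.get("data_classification") not in ALLOWED_CLASSIFICATIONS:
        errors.append(f"data_classification must be one of {sorted(ALLOWED_CLASSIFICATIONS)}")
    if descriptor.get("contains_pii") and descriptor.get("retention_days", 0) > 365:
        errors.append("PII data may not be retained for more than 365 days")
    return errors

violations = validate_data_product({
    "name": "search_queries_daily",
    "owner_team": "search",
    "data_classification": "internal",
    "retention_days": 400,
    "contains_pii": True,
})
if violations:
    raise SystemExit("governance check failed:\n" + "\n".join(violations))

The design point is that the check lives in shared tooling, so individual software engineers don't have to remember every governance rule themselves.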
S&S finished the conversation with a few quick quotes: "bet on curious people", "just have people talk to each other", and "lower the cognitive costs of using the tooling" - if you can do that, you'll raise your chance of success with your data mesh implementation.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 17, 2022 • 27min
Weekly Episode Summaries and Programming Notes - Week of July 17, 2022 - Data Mesh Radio
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 15, 2022 • 1h 19min
#101 H&M's Data Mesh Journey So Far Including Finding Reusability in Interesting Places - Interview w/ Erik Herou
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.
Erik's LinkedIn: https://www.linkedin.com/in/erikherou/
H&M Career page: https://career.hm.com/
In this episode, Scott interviewed Erik Herou, Lead Engineer of the Data Platform at H&M. To be clear, Erik was only representing his own views and perspectives.
A few key thoughts/takeaways from Erik's point of view:
Data mesh can work well with a product-centric organization strategy, as both look to put ownership and product thinking in the hands of the domains.
To develop a good data/enablement platform for data mesh, look to work with a number of different types of teams. That way, you can see the persistent/reusable patterns and capabilities and find ways to reduce friction for future data product development/deployment.
H&M had an existing cloud data lake that was/is working relatively well for existing use cases. But the team knew it likely wouldn't be able to handle where they wanted to go, with many more teams producing data products of much higher quality - and potentially sophistication.
When implementing data mesh - or any data initiative really - it is easy to fall into the trap of doing things the same way you did before. The "old way" feels safe and it was/is still working relatively well for H&M. So they treated their data mesh implementation as almost a greenfield deploy.
Because of the long-term focus on making it low friction and scalable to share data - the consumers will come as you make them more data literate - most of the early data/enablement platform work has been focused on helping data producers. This is a common pattern in data mesh, but your constraints and needs may not match.
Erik's team is focused on enabling data producers first specifically so his team doesn't become a bottleneck. It is easy for a platform team doing any part of the individual work to become that bottleneck.
Consider how much organizational change you require before starting to create mesh data products. H&M did a large amount of that organizational change upfront; other companies start in their current structure and evolve as they learn more. Both are valid and can work well.
Specific to H&M, a strong track record of good return on investment in AI meant there was less pushback than in many organizations when they started driving buy-in for implementing data mesh.
In the historical data warehouse world, there was less need for data literacy because most people were pushed reports but also couldn't do much, thus not "getting themselves in trouble". If we move to a more self-serve approach, we need much better data literacy - it can be a big risk to allow access without understanding. Otherwise, it could be like turning a six year old loose in a fully stocked kitchen where they intend to "make dinner".
Data catalogs could really help push forward general data practices, but we still need to have actual conversations too. Being able to ask someone what data means - and similar high-context exchanges - is crucial.
"If you have a complicated business, you have complicated data."
If your mesh data products don't maintain loose coupling, your data mesh implementation is probably headed for troubled territory. It's one of the key tenets of Zhamak's concept of a data product/quantum: to be architecturally independent.
Input ports are an easily overlooked place to find reuse. Many teams need the same type or style of processing from similar source systems. Having standard input ports can significantly help reduce the complications around building data product ingest mechanisms.
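To make that input port takeaway a bit more concrete, here is an invented illustration (not H&M's actual platform spec) of what standardized, declarative input and output ports might look like: the platform solves an ingest pattern like "consume a Kafka topic against an agreed schema" once, and many data products simply declare that they use it.

# Invented illustration of standardized input ports (not H&M's actual platform
# spec): the platform implements a small set of ingest patterns once, and data
# products declare which pattern they use instead of hand-building ingestion.
from dataclasses import dataclass

@dataclass(frozen=True)
class InputPort:
    port_type: str    # e.g. "kafka_topic", "sftp_drop", "cdc_stream"
    source: str
    schema_ref: str   # pointer to the agreed schema/contract

@dataclass(frozen=True)
class OutputPort:
    port_type: str    # e.g. "parquet_table", "rest_api"
    location: str

@dataclass
class DataProduct:
    name: str
    owner_domain: str
    input_ports: list[InputPort]
    output_ports: list[OutputPort]

# Two data products from different domains reusing the same "kafka_topic"
# input port style - the reuse described in the episode shows up at this layer.
store_sales = DataProduct(
    name="store_sales_daily",
    owner_domain="sales",
    input_ports=[InputPort("kafka_topic", "pos.transactions.v1", "schemas/pos_txn.avsc")],
    output_ports=[OutputPort("parquet_table", "lake/sales/store_sales_daily")],
)
online_returns = DataProduct(
    name="online_returns",
    owner_domain="customer_care",
    input_ports=[InputPort("kafka_topic", "ecom.returns.v1", "schemas/return.avsc")],
    output_ports=[OutputPort("parquet_table", "lake/care/online_returns")],
)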
About 3 years ago, when Zhamak's first data mesh article was published (May 20, 2019), H&M was reorganizing to be a product-centric organization, and data mesh dovetailed nicely with that strategy - they were moving away from IT as a service-oriented organization. Erik and team knew that with a move to a product-centric approach, teams would need to be - and would become - data savvy and "data intense". With their existing setup and knowledge, many teams would not be able to meet the new requirements; while H&M's early AI investments were paying off, many teams just weren't ready for that complicated of data work. To scale their org-wide data capabilities, they would need something like data mesh, because the teams doing AI were the very mature teams - maturing ~200 teams to that level would be essentially impossible. Especially when you think about getting to self-serve data producing and consuming as necessary to scale ways of working.
The management team at H&M was bought in to the product-centric reorganization, so overall it was not too difficult to drive buy-in for implementing data mesh at the same time, per Erik. There was buy-in and interest in participating from all types of teams, from pure data producer to pure consumer and everywhere in between. There were a number of teams with the capabilities and resources to participate.
As part of the platform/core enablement team for data mesh, Erik saw how helpful it was to work with multiple types of teams serving different needs. Because they worked on multiple pilots across a range of teams with differing capabilities as well as needs, they were better able to identify reusable parts of the data product development/deployment/management process to add to what the platform team offered.
Erik and team had a leg up on many other organizations considering data mesh: a data platform that was working well and serving current customer demands. Erik called the data consumers "happy enough" with their existing cloud data lake, as they could mostly do what they needed to do. But the data team also knew that their existing cloud data lake would not scale to what they needed in the mid- to long-term, as it would likely not be able to handle ~200 teams all producing data products. A key benefit of this existing well-functioning solution was that there wasn't a rush to get a replacement in place.
H&M's approach to building out their enablement/data platform was almost a greenfield approach, per Erik. He said it is easy to fall into patterns similar to what you've done in the past, especially since their existing solution was already working. But they knew they had to stay away from the gravity of what they'd done before and look for new ways of working. Again, they had the time to do it right and to think about the initial stages as a bridging solution, not a rip and replace. And thus far, it is working well.
To date, the main focus of H&M's data/enablement platform team has been building the self-service capabilities for the data producers. There is already a large pool of highly data literate data consumers, especially the previously mentioned teams that are advanced in applying AI. So these initial stages are about testing so the team can discover ways to make it easy for data producers to create and manage data products. Most of the initial data products are source-aligned, generic data products not tailored to any specific use case.
The mid-term data/enablement platform strategy focuses on iteration and learning patterns. Erik and team know they won't get it all right upfront, so making sure people understand there will be iteration and evolution is key to keeping people bought in to the long-term, big-picture vision. That's where they plan to really focus on making the platform as easy as possible for consumers as well.
Erik shared the big reason for focusing on building the enabling capabilities into the platform rather than the data processing or other capabilities. First, they already have a good platform that can do the data processing :D But also, by not taking on any of the work themselves and by finding ways to reduce friction, they can avoid becoming a bottleneck and make it easier for more teams to participate. It is easy to get dragged into doing specific work.
Per Erik, as many guests have said, data mesh is very much an organizationally-focused effort. The technology and architecture sides aren't easy, but to have a successful implementation, more effort will need to be spent on the organizational aspects. H&M was inspired by what Spotify has done with their organizational approach, leading to their previously mentioned product-centric thinking/approach. One interesting point: Erik believes you need to implement at least a decent amount of your organizational change at the start of your journey, or teams will struggle to deliver mesh data products.
So why did H&M not have much pushback - why were so many teams, including data producers, bought in to participating in the data mesh implementation? Per Erik, H&M has had a good track record of driving strong returns on their early investments in AI, especially around driving business optimizations. But overall, people understood that the current AI setup would not scale to a wider audience. So they've seen strong returns from doing data well and trust the data leadership to deliver further.
Erik made the interesting point that in the data warehouse world, most data consumers were plenty data literate relative to their needs - but that was because they were fed reports directly, with no real push to be inquisitive. Everything was also controlled, so there was a good data quality filter. Once you open up to self-serve consumption, that can cause issues.
The big issues Erik has seen with allowing self-serve access without proper training / data literacy efforts are mostly around data misuse. Not unethical or inappropriate use, but simply misunderstanding what the data means and which data to use to answer important questions. But he hopes that their data mesh implementation will guide people to the right information, especially by providing the right contacts to get more information. Per Erik, many, many people in data are putting a lot of hope in where data catalogs are headed.
But data catalogs should not be the only way people learn about what data is available or what that available data actually means. Conversations about data are valuable - and they can be fun! A good example Erik gave: if people are asking a lot of unexpected or possibly strange questions about your data product, it might be a signal you should re-engineer it.
Erik and Scott agreed that part of where data mesh approaches things so differently is the emphasis on loose coupling between data products. Coupling in data has historically made it extremely difficult to make changes, so we need to prevent that - BUT still make data interoperable. Otherwise it's just high quality data silos. Not every data product needs to interoperate with every other data product, though. And there also needs to be different types of data serving based on consumer needs, so data products will need multiple output ports.
In wrapping up, Erik shared the specific types of patterns and practices the data/enablement platform team is working on: schemas and schema handling generally, sensible defaults, input ports, etc. The input port example was really interesting and enlightening - Scott hadn't heard that example in 80+ interviews.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf