
Data Mesh Radio
Interviews with data mesh practitioners, deep dives/how-tos, anti-patterns, panels, chats (not debates) with skeptics, "mesh musings", and so much more. Host Scott Hirleman (founder of the Data Mesh Learning Community) shares his learnings - and those of the broader data community - from over a year of deep diving into data mesh.
Each episode contains a BLUF - bottom line, up front - so you can quickly absorb a few key takeaways and also decide if an episode will be useful to you - nothing worse than listening for 20+ minutes before figuring out if a podcast episode is going to be interesting and/or incremental ;) Hoping to provide quality transcripts in the future - if you want to help, please reach out!
Data Mesh Radio is also looking for guests to share their experience with data mesh! Even if that experience is 'I am confused, let's chat about' some specific topic. Yes, that could be you! You can check out our guest and feedback FAQ, including how to submit your name to be a guest and how to submit feedback - including anonymously if you want - here: https://docs.google.com/document/d/1dDdb1mEhmcYqx3xYAvPuM1FZMuGiCszyY9x8X250KuQ/edit?usp=sharing
Data Mesh Radio is committed to diversity and inclusion, including in our guests and guest hosts. If you are part of a minoritized group, please see this as an open invitation to be a guest - just hit the link above.
If you are looking for additional useful information on data mesh, we recommend the community resources from Data Mesh Learning. All are vendor independent. https://datameshlearning.com/community/
You should also follow Zhamak Dehghani (founder of the data mesh concept); she posts a lot of great things on LinkedIn and has a wonderful data mesh book through O'Reilly. Plus, she's just a nice person: https://www.linkedin.com/in/zhamak-dehghani/detail/recent-activity/shares/
Data Mesh Radio is provided as a free community resource by DataStax. If you need a database that is easy to scale - read: serverless - but also easy to develop for - many APIs including gRPC, REST, JSON, GraphQL, etc. all of which are OSS under the Stargate project - check out DataStax's AstraDB service :) Built on Apache Cassandra, AstraDB is very performant and oh yeah, is also multi-region/multi-cloud so you can focus on scaling your company, not your database. There's a free forever tier for poking around/home projects and you can also use code DAAP500 for a $500 free credit (apply under payment options): https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio
Latest episodes

Aug 1, 2022 • 1h 21min
#107 Focusing on Outcomes and Building Brave Teams in Data - Interview w/ Gretchen Moran
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.Gretchen's LinkedIn: https://www.linkedin.com/in/gretchenmoran/NGS' current openings: https://ngs.wd1.myworkdayjobs.com/ngs_external_career_siteIn this episode, Scott interviewed Gretchen Moran, the Senior Director, Data Products at the National Geographic Society (NGS; the non-profit arm of National Geographic).Some key takeaways/thoughts from Gretchen's point of view:NGS is a bit unique in that they don't have a widely deployed data architecture so they do not have a lot of habits to unlearn. Starting with a greenfield means likely more training and learning/experimenting will be required but at least no institutional unlearning.To move forward with data mesh, organizations must be able to embrace change - and the pain that it will inevitably bring - and embrace ambiguity. You need to move forward and figure it out together but also be okay with failure as a learning experience as you test what works for your organization.To win the hearts and minds of data producers, show them what high-quality data can mean for the organization and their domain/role. Work closely with them, understand their context, hold their hand to bring them along and align them to the vision of data mesh.It's easier to drive buy-in widely if you find the organizational influencers and win them over. It is the domino effect in practice. Partner closely with the influencers early on to drive your initiative forward.For NGS, they are working with a single initial data producing team for their proof of value. The data mesh world seems to be split a bit between working with one or two to three teams in the initial proof of value stage."Any technology effort is still a people effort."We have yet to learn how to leverage the knowledge and context of people without data knowledge in general in the data and analytics space. This is what data mesh tries to unlock but we are still figuring out how to do it well.It's very easy to intimidate people with data. We need to make tech and especially data much less intimidating to push broader adoption. The business context of those who aren't yet data literate can be extremely valuable. We need to lower the actual bar to leveraging data but also lower the perceived bar to leveraging data."Metrics + outcomes = value" - without outcomes attached, metrics have no value.Automation is going to be key to many aspects of data mesh. Upskilling people to leverage data will only really pay off if it doesn't mean a large increase in the amount of work to leverage data.User experience is crucial to getting the most value out of your data. Think about your data user experience (DUX) and bring in designers to help optimize the experience and really focus on data as a product thinking.NGS is still trying to find who should own generating and sharing insights on data combined from multiple domains. Is that a centralized insights team? 
Does that push us too far back towards centralization? It's still early days but those insights are crucial to driving value from data.We will see where new insights come from in data mesh. Will it be more insights from data consumers as they can spend time on data analysis instead of data cleaning? Or will it be the data producers as they learn how to really leverage their own data? Both?Many are waiting for vendors to validate data mesh but they really haven't done that very well yet. It remains to be seen if they even could validate something so complex and large in scope.Building brave teams - teams that aren't afraid of new challenges or of failure or especially of ambiguity - will be crucial to getting data mesh right. Teams might be afraid of doing things "wrong" but as long as they work to get to a good state and incorporate feedback and learnings, that's what will drive much more value in the long-run.There is really shared ownership of data even inside the same domain. The subject matter experts and the people shaping the data products must build a strong relationship with good communication.Data mesh aims to - and needs to - solve for the cost of change in data. Traditional data warehouses have had an extremely high cost of change. The huge cascading pipeline setups most have with data lake too. Data mesh needs to make evolution in data much easier, quicker, safer, less costly, etc. It's still early days there.There is a delicate balance between over architecting and underinvesting in your platform. Look to build for reuse and don't lock yourself into decisions where possible. Far easier said than done.Many teams are worrying if they are doing data sharing wrong. But can they actually really do it wrong? Yes, probably, but if they are open to feedback and paying attention, they don't have to get it "right" the first time. You can evolve to get to a very good place but prior data setups have been so rigid, that evolution has been tough and costly.You probably don't need to build out as large of a team as you might think to start on your data mesh journey - depending on your timeline. It will take years to get to delivering fully on your vision but you can add a lot of value as you progress and learn - you don't have to get it perfect at the start! And consider what skillsets are really crucial.Gretchen started off by giving her background and some of the ways her history has played into her perspective and current role. A big factor in her interest in data mesh was helping a number of large organizations evolve their data platforms and how that helped those organizations deliver better results - but they still often struggled somewhat to derive full value from their data and data mesh can hopefully unlock that value.For the National Geographic Society relative to data mesh, preparation has been Gretchen and team's keyword. Rather than trying to move forward with their data mesh implementation as fast as possible, they've spent the last year testing and preparing for implementing their data strategy. And they are in a bit of a unique situation because even though the organization is over 100 years old, really their tech stack is about seven years old and they don't have a cohesive data architecture deployed. This means they also don't have a lot to unlearn but have a ton to learn and experiment on.NGS is already organized in a product-centric approach and the technologists really understand their domains. 
Now, Gretchen and team just need to get them bought in that they should treat their data like they do their applications - like a product - as they move forward with their data mesh implementation. Easier said than done, but the organization in general hasn't been pushing back on these ideas, which has meant good initial collaboration.

While embracing sharing data is crucial to NGS' overall organization-wide strategy, it's not all sunshine and rainbows, as Gretchen knows there will be a lot of heavy lifting: heavy lifting around change, heavy lifting in going against the status quo. To be successful with a data mesh implementation, the organization is going to have to embrace change and ambiguity. And both are typically painful.

Gretchen and team knew that to lay the groundwork for something like data mesh, their organization needed a base layer of data literacy. Without an understanding of data, would they even have people to consume data, much less people capable and willing to produce their data like a product? So they started by bringing in consultants to help people start learning and to build a general business glossary.

But to really reach the "hearts and minds" of the general organization, Gretchen knew they needed to show people what value data can bring them. What are their goals and needs, and how can data support that? How is something like having high-quality data available valuable to data consumers and - more tricky - how is it also valuable to data producers? And part of their data literacy/upskilling process was showing people what using data could mean for them, not just a training course in SQL or Tableau. Just training people how to use a piece of technology in a vacuum hasn't worked well. A success vector for Gretchen has been finding the organizational "influencers" that provide the leverage to drive buy-in across the org.

So, how are Gretchen and team getting going after their preparation period? They are partnering very closely with an initial pilot team and are going to prove out the value to share more broadly across the organization. This has been an interesting question in data mesh - how many domains and/or data products should be involved in your proof of value?

Broadly speaking, Gretchen believes - and Scott agreed - any technology effort is still very much a people effort. It is very hard to do something like make data self-evident, so the people need to steer and steward any such technology effort forward. Then add in the fact that data mesh is much more organizational/process focused and the people side becomes even more crucial.

Gretchen talked about metrics in general and her theory of the "bad metrics sin" - that it is worse to have bad metrics than no metrics at all. And to identify early - and then stay away from - vanity metrics. She strongly believes that metrics + outcomes = value, so without the outcomes, metrics don't have value. As Sadie Martin and Katie Bauer mentioned in their episodes, measure what matters and measure what you will act on. And measuring impact in the NGO (non-governmental organization; a term for many non-profits) space is particularly difficult - Gretchen used the word persnickety - so really finding your useful metrics and backing them up can be a challenge but is crucial.

One behavioral change Gretchen is pushing heavily as people learn more and more about data: asking about people's rationales when making a choice working with data. The outcome is more context for all involved - people use their context to make choices, so learning why they made a choice can highlight very interesting points. Why did they go with X versus Y? It's crucial to do this to enhance curiosity and learning rather than asking people to prove their reasoning/understanding. So ask with the tone and goal of "tell me more".

And it's easy and quite common to intimidate people with data, per Gretchen. We need to lower the actual bar to leveraging data but even more so, we need to lower the perceived bar of how challenging it is to leverage data. Part of doing that is meeting people where they are, showing them how they can leverage their current knowledge and skills while upskilling them to be even more effective.

Gretchen is seeing people in NGS jumping through so many hoops to produce reports and data in very manual ways, so enabling them to produce and consume data automatically and more reliably is something she's excited to take on. That way, NGS can leverage their knowledge and skills without the manual effort - allowing them to focus on the value-add aspect of working with data: the insights and how to act on them.

User experience (UX) is really crucial to everything NGS does, per Gretchen. Their product managers spend a lot of time really understanding the business aspect of things, not just the software pieces. So they now need to learn how to do the same with data. Product thinking is crucial to getting data mesh right, not just creating data products. How can we move to sharing actual insights instead of just data? And especially, who owns creating and sharing insights on data combined from multiple domains?

For Gretchen, it will also be interesting to see what additional insights can be generated when we focus on keeping data clean from the start, rather than having data consumers clean it up after the fact. What additional insights might come from people actively monitoring the collection and processing of information? Who will generate the new insights? Will it be the traditional data consumers, who can now spend the time to work with the data instead of cleaning it? Or will more insights flow from the data producers as they really get their arms around their own data? The answer is probably both.

Building brave teams - teams that aren't afraid of new challenges or of failure or especially of ambiguity - will be crucial to getting data mesh right in Gretchen's view. People have to welcome change and understand that while change is painful, there is a point and purpose for it. Give them the understanding of what the change is for and what the reasoning is.

Gretchen and team are trying to ensure they aren't over architecting the data platform - putting in too much work too early and locking themselves into choices if there isn't a need. But then it is quite easy to underinvest and not provide what people actually need. So she's really focused on making the platform robust enough but not too rigid or expensive. It's a hard needle to thread.

Many teams in NGS are worrying if they are doing data sharing wrong, per Gretchen. But can they actually really do it wrong? Yeah, probably, but if they are open to feedback and paying attention, they don't have to get it "right" the first time to get it right eventually. You can evolve to get to a very good place - prior data setups have been so rigid, and change so extremely painful, that evolution has been tough.
Data mesh needs to solve for lowering the cost and fear of change in data but it's still early days.Gretchen doesn't think you need to build out a huge team to do data mesh, or at least to get moving. Her team's approach is to build a reusable base for generating and managing mesh data products and have a few data architects to keep moving things in the right direction. Then, they have the team and the drive to teach developers how to manage data as a product and get them bought in that it's necessary to do so.Some rapid-fire insights from Gretchen to wrap up:We have yet to learn how to leverage the knowledge and context of people without data knowledge in general in the data and analytics space. This is what data mesh tries to unlock but we are still figuring it out.There are good incentives for teams to produce high quality and reliable data but you have to work with them closely to explain it.The concept of data lake was to only invest in cleaning and maintaining the data when there was a clear use-case, a clear reason to invest that time. But it was clean-up, not proactive cleaning, and typically had opaque and/or mediocre ownership - that made it much harder to derive the value.Vendors have yet to really validate data mesh and that means many folks are still sitting on the sidelines. It will be interesting to see if vendors really can ever validate it given how complex and large in scope data mesh really is.To do data mesh right, many stakeholders need to parse the principles - or at least what the principles are trying to achieve - and then, crucially, adapt them to your culture. Data mesh can't be about cutting and pasting from someone else's implementation.Shared ownership of data is very hard. That seems obvious but even within the domain, there is shared ownership between the subject matter experts and those shaping the data to share externally. There needs to be strong communication and a good relationship between those parties.Really spend the time to consider what skillsets you actually need. And when you will need them. It's okay to have more basic data products in the early days of a mesh implementation as developers learn how to work with data properly.Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereAll music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 31, 2022 • 29min
Weekly Episode Summaries and Programming Notes - Week of July 31, 2022 - Data Mesh Radio
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 29, 2022 • 1h 8min
#106 Building an Effective Data Strategy: Why oh Why Don't You Start with the Why - Interview w/ Liz Henderson
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center hereLiz's LinkedIn: https://www.linkedin.com/in/lizhendersondata/Liz's website: https://lizhendersondata.wordpress.com/In this episode, Scott interviewed Liz Henderson AKA The Data Queen, Executive Advisor at Capgemini. To be clear, Liz was only representing her own views on the podcast.Some high-level takeaways/thoughts from Liz's view:To drive buy-in and engagement in a data strategy - especially with people outside the IT/data team - focus on the "why". Why are you doing this initiative or approach? What business goals is it supporting?Also on driving buy-in for a data initiative, start by listening instead of selling/pitching. Focus on the business needs and work backwards to show how data can help address those needs.To be successful with a large change-management data initiative, you need the patience, leadership, courage, and will to push forward. And the budget - don't forget the budget :DYou can't have an effective data strategy if it isn't directly tied into the business strategy. Really consider how data can help to support and execute on the business strategy. Data strategy in a vacuum away from the business is a recipe for trouble.It's very easy to get overly focused in data on what you are delivering instead of why you are delivering it and who it is supposed to serve. If you want to be successful, you need to focus on the latter two. And look to deliver continuous incremental value rather than a back-end loaded value delivery.Change management is very easy to get wrong in data. Really consider if you can not only get the ball rolling, but keep it rolling and in the right direction to implement a large change. Loss of momentum can mean loss of funding.If data mesh follows a similar pattern to data literacy, it's likely to be 3-4 years from initial large swell in hype around data mesh - whether that was late 2021 or more now - until we really see a clear picture of how more organizations have implemented. There needs to be a time for trial and error.Data literacy can only get you so far - you need data storytelling and visualization as the business people need to be able to understand what the data is saying to drive decisions.The 3 most common ways organizations go wrong with data are 1) technology-first / expecting to buy your way to a solution to challenges; 2) not asking the "why" questions; and 3) not having a data strategy.Liz started the conversation talking about how data is an asset to the company - not treating data as an asset, an important differentiation - and started with a common theme for the conversation: "why?". Why is data important to the business? Why are we doing the data work we are doing? What is the purpose?When speaking with company executives, especially those outside the IT/data team, Liz starts every conversation by digging into what is the business trying to do. What is the business strategy? 
How does data currently play into that strategy? What role do they want data to play in the business strategy? What role might data play if everything were perfect? You can't have an effective data strategy without it tying directly to the business strategy.When considering any aspect of a data strategy, whether that is data mesh or something else, Liz again recommends starting from the why. In some conversations, IT leaders have said "we want to do data mesh" but when Liz dug deeper, they couldn't answer why. Huge red flag. So when considering a data initiative, think about what is the target impact of the work. How would executing on the data strategy impact the business, not the IT environment? Always ask "so what?". Why is this the right path forward for the business and how is data pushing the business results forward?For Liz, if you are speaking with executives outside of the IT/data team about a data initiative like data mesh, to drive buy-in, don't start by selling, start by listening. What are the challenges they are having and work backwards to address those challenges through data. What are the business needs and wants, and how can data help them to address them? That should heavily inform your data strategy.From experience, Liz knows it is quite easy to deliver a very large, costly data initiative that no one really uses or benefits from, AKA a "white elephant". So people play probably an even bigger role in data initiative and data strategy success than most would assume. The cultural aspect is crucial to doing something like data mesh well - if the data consumers still only want to use spreadsheets, it doesn't matter if you are delivering the best data products in the world, there won't be the demand to make the work worth it.Liz gave a specific example of a recent conversation with a company wanting to do data mesh when talking with IT leaders. When she started to dig in, the business wasn't involved at all with the decision to do data mesh. And a big part of doing data mesh is, you know, the business teams owning the data. It's the first data mesh principle! So if you run across a situation like this, you should ask why they want to implement without the business' involvement. Will that be change for the sake of change instead of driving business results?Patience, leadership, courage, and will are all necessary to effectively execute a data change management initiative or overarching data strategy in Liz's view. And don't forget the budget - both for getting going and for the continued improvement and maintenance. It's often easy to get data change started but maintaining the momentum, especially in the right direction, can be quite difficult. And if momentum starts to falter, budget can go away quickly in many orgs. Really consider if you are ready for a large-scale change before moving forward. As Zhamak has said many times, "Think big. Start small. Move fast."Liz shared her insights into how it often takes 3-4 years for a new, large-scale approach to how people work in data to go from everyone talking about it until we see how people are actually implementing it. Data culture and data literacy were all the rage 3-4 years ago but there wasn't much info about how to actually implement a data literacy strategy - we are just starting to see adoption stories being shared. It might take that long for data mesh.So what happened with data literacy where it is now relatively widespread? 
For Liz, a lot of it was the general data and analytics industry maturing with a strong general awareness of the concept and need for data literacy rather than any one point or push. With data literacy, employees can understand how data impacts their role and thus how it impacts the business. So that is possibly how data mesh might evolve - broad awareness and the brave bleeding edge folks helping to mature the concept and find the useful patterns and anti-patterns but it still takes quite a while.Circling back on how crucial the people aspect is for data mesh - or any data initiative - Liz is aligned with the research that people are the silent success or failure point for most data initiatives rather than technology or architecture. Technology-led initiatives in data are quite likely to fail.Liz recommends that companies looking to be more "data driven" or "data informed" really show to employees how data impacts the overall business and encourage them to consider "the art of the possible" relative to data. And consider ways to take feedback and data requests to enhance the business but not in a single data run for one person type of way, but new data products to share that information with many more people. That way, people know that their insights might actually mean something and might result in a new service or opportunity. And it encourages people to speak up more about what might be interesting additional information to have, leading to better data products.Data storytelling and strong visualization are an often overlooked part of doing data right per Liz. Having that data translator is crucial to take the information and make it so 1) people can understand what the data is telling us - the insight - and 2) what that might mean - the potential action. Just sharing information without the understanding is overwhelming and confusing.Liz wrapped up her thoughts with 3 key points on where people go wrong in data: 1) using a technology-first approach - stop picking solutions to try to be a silver bullet, really consider what you are trying to do first; 2) not asking the why, especially about "why do we want to be data driven"; and 3) not having a data strategy at all - you need a compass to move forward.Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereAll music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 26, 2022 • 9min
#105 Data Modeling in Data Mesh Part 1 - Mesh Musings 23
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 25, 2022 • 1h 4min
#104 How Does Data Mesh Impact the Business: Learnings from T-Mobile Polska's Early Journey - Interview w/ Karolina Henzel
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.Karolina's LinkedIn: https://www.linkedin.com/in/karolina-henzel/In this episode, Scott interviewed Karolina Henzel, Data Enablement Tribe Lead at T-Mobile Polska. FYI, businesses and domains are fundamentally similar in this conversation and are used essentially interchangeably.Some high-level takeaways/thoughts/summarizations from Karolina's view:Business transformation and impact is what really matters. Digital transformation is just a mechanism to transform your business into being more digital native and focused. Data transformation is just a part of digital transformation. Transformation should all come down to driving positive business impact.To drive something like data mesh forward, you really need top management support, likely a C-level executive sponsor. Otherwise, it is very easy for work to get deprioritized and pushed out.Don't take on a large-scale data initiative unless there are specific business challenges to address. Don't do data mesh for the sake of "being data driven"; what are the issues and why will addressing them help your business? Explicitly define the problems and the pain points.To drive change, look for "change agents" in the domains. They are people with the will and capabilities to drive large-scale change. They aren't always easy to find but once you start to identify them, patterns will emerge.The big pain points T-Mobile Polska was facing were: 1) poor/inconsistent data quality; 2) data discovery difficulties; and 3) slow time-to-market for new data and insights.T-Mobile Polska was able to move forward with data mesh because business representatives in the domains were bought in that addressing the data pain points would drive incremental business value - there would be a return on the data work investment.Look for quick wins and how to deliver continuous incremental value. If it is all about producing a big bang, you will very likely lose momentum, prioritization, and funding. Continuous value delivery is crucial to keeping people excited about data work.T-Mobile Polska's data quality issues were caused mostly by a lack of accountability/ownership and not adhering to standard definitions across domains and reports.Lack of standard definitions - or at least very clearly differentiated definitions - can cause numbers to not match across reports. And that makes people not trust the data. So drive domains to clearly define terms and, if possible, look to create standard definitions across the organization.Find KPIs that are focused on what actually impacts the business. Dave Colls mentioned fitness functions to break down measuring progress against big challenges into smaller measurements. Karolina and team are making sure the end result is business impact, not technical-only change.You can drive buy-in that data producers should provide high quality data products by showing producers the impact their efforts are having. 
Can be a chicken and egg issue at first but you can take the results from one domain or business and show them to another to drive buy-in.T-Mobile Polska has a data owner in each domain that has high-level data ownership accountability. There is a data steward in each domain as well that is the subject matter expert. When needed, there is also a technical data steward (embedded data engineer) but it's not necessary in many domains.As a leader in the data governance team at T-Mobile Polska, Karolina has a considerable number of different aspects under her including the data platform, the data warehouse, data quality, etc. Much like data governance as a term has many aspects.Karolina started off the conversation with an important and useful point she emphasized a few times: you need to have specific challenges you want to solve before you embark on something like a data mesh journey. Not just "let's be data driven" - what business challenges are you trying to solve? And look at them from a "why" perspective - why does tackling this challenge matter? Those challenges could be poor data quality, time spent on non-value-added tasks, data discovery issues, etc. Define the problems and the pain points.It's important to understand that digital transformation is about business transformation first and foremost. What is your business trying to achieve? And what are the target/expected business outcomes? More revenue? Cost savings? Etc. You need to define the pain points and what doing something like data mesh will do for the organization to secure cooperation from business leaders. And don't count on patience as you work towards a big value delivery in the future - you need to continuously create "incremental value" along the way.For Karolina, there were three main challenges they needed to address relative to data with their data mesh implementation: 1) data quality was a constant issue - typically stemming from lack of real ownership/accountability and no standard term definitions; 2) data discovery - it was very difficult for data consumers to find data; and 3) time-to-market for new data and insights - the data function was becoming a major bottleneck to the business side.Data governance can't only be an IT problem/challenge, per Karolina. In their data mesh implementation, they are focusing the central governance team on creating the tools and frameworks for the distributed teams to leverage. For instance, the business and technical metadata comes from the domains but the data catalog is offered as part of the platform by the governance team. This separation of duties has allowed quick time to business benefits when bringing on new teams to their data mesh implementation.Karolina and team knew they were facing issues with data so they started interviewing business representatives to ask what were their biggest challenges. The governance team heard repeatedly data quality was an issue but didn't know exactly why they were having data quality issues. So they moved to increase accountability, assigning data owners and data stewards. Collaborating with the owners and stewards, they were able to figure out a few major causes were: a lack of real ownership, no common definitions, no real standard measurement of quality, etc. And addressing those challenges resulted in some quick wins to get positive momentum towards delivering continuous incremental value.At T-Mobile Polska, Karolina has seen how crucial having a C-level sponsor is to succeeding with something like a data mesh implementation. 
It is very easy to lose prioritization - there is always a more pressing short-term business need than producing high quality data so you need someone that can make sure that data work isn't unreasonably pushed out. Specifically, they created a data governance committee to have strategic supervision of the data governance and data quality efforts and identify the strategic initiatives to continuously deliver incremental value and put things on businesses roadmaps.Scott asked a question he asks many people: what is the reason for creating new mesh data products at T-Mobile Polska? Karolina shared that data products are initially created to serve reporting specifically in most cases. They can expand to serve additional use cases but there is a specific use case in mind for each new mesh data product.Karolina discussed some of the new ways of working and the challenges around the necessary mindset shifts to implement something like data mesh. People were just used to data engineering delivering the data. So producers were used to throwing things over the wall and data consumers were used to making asks to a highly data literate group of people. So, they are inventing new ways of working and processes to not have data engineering handling the communication between teams. Business owners are in charge of explaining why owning and serving their data as a product can add value to their org, what is in it for each person in their own org. One explanation that has resonated well - and been proved out repeatedly - is that by moving to a data mesh way of working, there is a significant reduction in time-to-market for new data and insights including for the producing domain.As part of their data mesh implementation, Karolina and team have been restructuring KPIs to make it possible to measure the impact of the data work they are doing. Their focus is on the impact to the business, not technical focused KPIs. One big goal - with a few proof points thus far - has been a reduction in data work that doesn't add value - reducing the time your data science team spends on things that aren't valuable means they can put more value-add models into production. Another big goal, as previously mentioned, is reducing the time-to-market for new data and insights as many other data mesh implementers are seeing. And Karolina's team is driving buy-in through results by showing data producers how much impact they are having or could have by providing quality data.As for how T-Mobile Polska started their journey, Karolina and team started with laying the foundation for good data governance. They first found the data owners and the data stewards in each business. Then they explained the new responsibilities for those roles and why they were necessary. The data owner is at the Director level, essentially the business or domain owner, and the data steward is more of a subject matter expert. And if there are complicated data needs, that domain needs a technical data steward - an embedded data engineer - as well; but not many domains need a technical data steward. Another thing specifically mentioned was leveraging "change agents", the people with the will and the capabilities to drive large-scale change.Karolina then shared some of the issues they've had with data democratization. Similar to what Ust Oldfield mentioned in his episode, just giving access to data when people don't really understand how to leverage data can do more harm than good. 
So T-Mobile Polska is pushing the not as data literate people to the data catalog as their only point of interface with data on the mesh; the governance team is focused on enabling producers to create standardized reports and datasets to serve those people. The more technical folks have more options to interface with data with fewer technical guardrails.In wrapping up, Karolina reiterated a few of her main points. 1) Focus everyone on what you are trying to accomplish - what are the priorities? What is the impact to the business? 2) Look to deliver incremental value continuously to build and maintain momentum in your data mesh implementation - without that incremental value, support for your implementation is likely to falter. And 3) C-Level management support is crucial to really drive an initiative like data mesh - without it, your work is likely to get deprioritized and will be continuously pushed out.Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereAll music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
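One way to picture the ownership and shared-definition ideas from this episode: below is a rough, hypothetical sketch in Python of a descriptor a domain team might publish alongside a data product, naming the data owner, the data steward, and the agreed term definitions so that reports stop disagreeing about what a metric means. The structure, field names, and example values are assumptions for illustration only, not T-Mobile Polska's actual setup.

```python
# Hypothetical sketch: a machine-readable descriptor a domain team might publish
# alongside a data product, capturing the ownership roles and shared term
# definitions discussed in the episode. Field names are illustrative only.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TermDefinition:
    name: str           # e.g. a business term like "active_subscriber"
    definition: str     # the agreed, organization-wide wording
    owning_domain: str  # which domain is the authority for this term


@dataclass
class DataProductDescriptor:
    name: str
    domain: str
    data_owner: str                          # director-level accountability for the data
    data_steward: str                        # subject matter expert in the domain
    technical_steward: Optional[str] = None  # embedded data engineer, only where needed
    initial_use_case: str = ""               # the specific use case the product was built for
    terms: list = field(default_factory=list)

    def undefined_terms(self, columns):
        """Return columns that would ship without an agreed definition - a common
        source of numbers not matching across reports."""
        defined = {t.name for t in self.terms}
        return [c for c in columns if c not in defined]


# Illustrative usage: flag fields that would be published without a shared definition.
descriptor = DataProductDescriptor(
    name="postpaid_churn_monthly",
    domain="consumer_mobile",
    data_owner="director.consumer@example.com",
    data_steward="sme.consumer@example.com",
    initial_use_case="monthly churn reporting",
    terms=[TermDefinition(
        name="active_subscriber",
        definition="SIM with billable usage in the calendar month",
        owning_domain="consumer_mobile",
    )],
)
print(descriptor.undefined_terms(["active_subscriber", "churn_flag"]))  # -> ['churn_flag']
```

A check like undefined_terms could plausibly live in the central governance team's tooling rather than being run by hand in each domain, in line with the "tools and frameworks from the center, content from the domains" split described above.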

Jul 24, 2022 • 27min
Weekly Episode Summaries and Programming Notes - Week of July 24, 2022 - Data Mesh Radio
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 22, 2022 • 1h 13min
#103 4 Years of Learnings on Decentralized Data: ABN AMRO's Data Mesh Journey - Interview w/ Mahmoud Yassin
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center hereMahmoud's LinkedIn: https://www.linkedin.com/in/mahmoudyassin007/In this episode, Scott interviewed Mahmoud Yassin, Lead Data Architect at ABN AMRO, a large bank headquartered in the Netherlands.Some high-level takeaways/thoughts from Mahmoud's view:It's very difficult to do fully decentralized MDM, which led to some duplication of effort - that can mean increased cost and people not using the best data. ABN tackled this through their Data Integration Access Layer - similar to a service bus.They are using that centralized layer - called DIAL - to help teams manage integrations that are both consistently running and on-the-fly. It helps monitor for duplication of work instead of reuse.If Mahmoud could do it again, he'd focus on enabling easy data integration earlier in their journey to encourage more data consumption. Cross domain and cross data product consumption is highly valuable.The industry needs to develop more and better standards to enable easy data integration.Data mesh and similar decentralized data approaches cannot fully decentralize everything. Look for places to centralize offerings in a platform or platform-like approach that can be leveraged by decentralized teams.Most current data technology licensing models aren't well designed for or suited to doing decentralized data - it's easy to pay a lot if you aren't careful - or even if you are careful!A tough but necessary mentality shift is not thinking about being "done" once data is delivered. That's data projects, not data as a product.Try to keep as much work as possible within the domain boundary when doing data work. Of course, cross-domain communication is key but try to limit the actual work dependencies on other domains if possible.A data marketplace enables organizations to more easily create a standardized experience across data products and make data discovery much easier. You don't necessarily have to tie your cost allocation models to the marketplace concept.Sharing what analytical queries/data integration "recipes" people are using has been important for ABN. It drives insights across boundaries and also creates a lower bar to interesting tangential insight creation/development.You should consider not allowing integrations across multiple data products by default. Producers should be able to stop integrations - for compliance purposes or because the integration doesn't actually provide good/valuable/correct insights.Traditional ETL development is about translating the business needs to code. But centralized IT usually can't deeply understand the business context and needs so they deliver substandard solutions. If you consider that business needs evolve, it gets even worse.Mahmoud started his career in data as an ETL developer so he saw the ever increasing issues with the traditional enterprise data warehouse approach in large organizations. 
Then he moved on to working with the common way people have approached data lakes - managed by a centralized team - and to him the issues seemed pretty similar to those with data warehouses. So he was glad to start working with ABN AMRO on decentralizing their approach to data starting about 4 years ago.

ETL development is about translating the business needs to code. But Mahmoud saw the same problem many organizations are having - it is very hard for IT to really understand the real business context and needs relative to data. They try, but it often is only on the second or third attempt - if at all - that IT really understands and gets it right. They simply cannot get enough context to serve needs well.

So to kick off the discussion, Mahmoud made it clear: there is no perfect data architecture. Not one that fits all organizations, and not even one that will fit your organization over time as it evolves or across all of its needs - so look to what fits your organization at the moment. And it's okay to take pieces from multiple approaches and try them to see if they fit as a cohesive strategy. But do make sure not to just pick the easiest or most fun parts from multiple strategies - cohesion is crucial.

For Mahmoud, a key mentality shift in doing decentralized data, especially data mesh, has been around what "done" looks like. When you think about a physical goods-type product, it's not done once it goes to production. With IT-run data, it was typically that project mentality and "done" was when you delivered the data and moved on. It is crucial to learn how to do actual product management, not just software product management, to understand how to do data as a product right.

A key learning from Mahmoud and team, which echoes something Jesse Anderson mentioned, is trying to keep as much of the work done around data inside the domain. There is obviously a need for cross-domain communication and collaboration, but there is a big cost to crossing domain boundaries when doing data work.

While some people think we should decentralize everything we can in data - Scott calls those people simply "wrong" - Mahmoud and team found there to be a significant cost to decentralizing the wrong things. They have a centralized governance layer to make things easier on data product producers and consumers. And trying to do fully decentralized MDM (master data management) can quickly lead to duplication of effort and data. Omar Khawaja mentioned similar issues early in Roche's journey.

So, how did ABN tackle these data duplication challenges? Per Mahmoud, they created DIAL, their Data Integration Access Layer. This is similar to a service bus on the operational plane, with the DIAL layer handling data quality checking, business metadata, technical metadata, checking against an interoperable data format, etc. It is another instance of a centrally managed service that is leveraged by decentralized teams.

Similar to a number of other organizations, Mahmoud discussed how ABN AMRO is creating an internal data marketplace as the mechanism for centralized data discovery and consumption. This way, there is a standard user experience when looking for and trying to understand what data is available. A standardized experience is crucial to really drive data consumption. The marketplace requirements also lead to a very transparent way to share data.

Per Mahmoud, ABN is also working on making the data integration experience standardized in a few different ways.
The previously mentioned DIAL layer is a centralized way to do integrations, whether that is creating new data sets that are reused across multiple downstream data products or integrating in more of a virtualized, on-the-fly way. If you aren't careful, it is pretty easy - especially if there are domains that might naturally touch on similar concepts - to duplicate work, which can cost A LOT. Especially because most data tool licensing isn't designed for doing decentralized data.As part of or similar to the marketplace concept, Mahmoud talked about how ABN is creating integration recipes. So while recipes may not be a data product, these repeated integrations may be similar to a downstream data product in how they present to data consumers. And other consumers can leverage the same recipe or clone it and adapt it to their needs. It has been very important to share what recipes others are using to drive insight sharing across domains.To help manage compliance/governance and also to make sure data consumers understand what they are actually consuming, the DIAL layer prevents people from doing data integration without consent from data producers. Ust Oldfield mentioned something similar regarding how self-serve without understanding by data consumers can cause major issues.Mahmoud and Scott discussed how different just creating data products and data as a product thinking are. If you are really thinking of your data as a product, versioning and the actual data product interface are crucial. And with versioning, it's important to know who will be impacted by a change when assessing if and how a change should happen.One thing Mahmoud would do differently is focusing more on encouraging/enabling data consumption earlier in the journey. While consumption is picking up, it still is below desired levels and is behind how mature they are with getting data on to the platform to share. Part of the reason for lower than desired consumption came from leaving the focus on data integration until later in their journey. They are trying to find - or if not find, then develop - better standards to make data integration easier. While there are some standards for metadata like OpenMetadata, it's still early days.Lastly, Mahmoud mentioned how their metadata was just getting to be in too many places so they are building out a metadata lake - a tool-agnostic lake for their metadata. It remains to be seen if this is a common pattern in data mesh but it may address one of Scott's big concerns - the "trapped metadata" problem.Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereAll music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
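Mahmoud describes DIAL only at a high level, but to make the idea of a centrally managed integration layer more tangible, here is a minimal, hypothetical sketch in Python of the kinds of gate checks such a layer might run before allowing a cross-product integration: producer consent, required metadata, and a basic freshness check. All names, fields, and rules here are assumptions for illustration; this is not ABN AMRO's implementation.

```python
# Hypothetical sketch of the kind of gate a centralized integration layer (like
# the DIAL concept described above) might apply before a consumer is allowed to
# integrate data products: producer consent, required metadata, and freshness.
from datetime import datetime, timedelta, timezone

REQUIRED_METADATA = {"owner", "description", "schema_version", "last_updated"}


def producer_has_consented(product: dict, consumer: str) -> bool:
    # Producers can explicitly block integrations (e.g., for compliance reasons).
    return consumer in product.get("approved_consumers", [])


def metadata_is_complete(product: dict) -> bool:
    return REQUIRED_METADATA.issubset(product.get("metadata", {}).keys())


def is_fresh(product: dict, max_age: timedelta = timedelta(days=1)) -> bool:
    # last_updated is assumed to be an ISO-8601 timestamp with timezone info.
    last_updated = datetime.fromisoformat(product["metadata"]["last_updated"])
    return datetime.now(timezone.utc) - last_updated <= max_age


def can_integrate(products: list, consumer: str):
    """Return whether the integration may proceed, plus the reasons it cannot."""
    problems = []
    for p in products:
        if not producer_has_consented(p, consumer):
            problems.append(f"{p['name']}: producer has not approved {consumer}")
        if not metadata_is_complete(p):
            problems.append(f"{p['name']}: missing required metadata")
        elif not is_fresh(p):
            problems.append(f"{p['name']}: data is stale")
    return (len(problems) == 0, problems)


# Illustrative usage with a single, made-up data product.
product = {
    "name": "payments_transactions",
    "approved_consumers": ["fraud_analytics"],
    "metadata": {
        "owner": "payments-team@example.com",
        "description": "Cleared card transactions",
        "schema_version": "2.1",
        "last_updated": "2022-07-20T06:00:00+00:00",
    },
}
ok, reasons = can_integrate([product], consumer="marketing_analytics")
print(ok, reasons)  # False, with the blocking reasons listed
```

The point of centralizing a check like this is that every integration gets the same governance treatment without each consuming team re-implementing it - the same "centrally managed service leveraged by decentralized teams" pattern the episode describes.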

Jul 18, 2022 • 1h 12min
#102 Share Data by Default and Other Stories/Advice from Leboncoin's Data Mesh Journey So Far - Interview w/ Stéphanie Bergamo and Simon Maurin
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center hereStéphanie BergamoLinkedIn: https://www.linkedin.com/in/st%C3%A9phanie-baltus/Twitter: @steph_baltus / https://twitter.com/steph_baltusSimon MaurinLinkedIn: https://www.linkedin.com/in/simon-maurin-369471b8/Twitter: @MaurinSimon / https://twitter.com/MaurinSimonIn this episode, Scott interviewed Stéphanie Bergamo and Simon Maurin of Leboncoin. Stéphanie is a Lead Data Engineer and Simon is a Lead Architect at Leboncoin. From here on, S&S will refer to Stéphanie and Simon.Some key takeaways/thoughts from Stéphanie and Simon's point of view:"Bet on curious people", "just have people talk to each other", and "lower the cognitive costs of using the tooling" - if you can do that, you'll raise your chance of success with your data mesh implementation.Leboncoin requires teams to share information on the enterprise service bus that might not be directly useful to the originating domain on the operational plane. They are using a similar approach with data for data mesh - sharing information that might not be useful directly to the originating domain by default.Leboncoin presses teams to get data requests to other teams early so they can prioritize it. There isn't an expectation of producing new data very quickly after a new request, which is probably a healthy approach to data work/collaboration.Embedding a data engineer into a domain doesn't make everything easy, it's not magic. Software engineers will still need a lot of training and help to really understand data engineering practices. Tooling and frameworks can only go so far. Be prepared for friction.Similarly, getting data engineers to realize that data engineering is just software engineering but for data - and to actually treat it as such - might be even harder.Software engineers generally don't know how to write good tests relative to data. Neither do data engineers. But testing is possibly more important in data than in software. We all need to get better at data testing.Start with building the self-service platform to solve the challenges of the data producers first. You may make it very easy to discover and consume data but if the producers aren't producing any data...If your software engineers are doing data pipelines at all before starting to work with them in a data mesh implementation, you can probably expect they aren't using best practices.It's pretty common for good/best practices to be known by only a few people inside an organization, such as with a specialty-focused guild. Look for ways to cross-pollinate information so more people are at least aware of best practices if not able to fully implement them yet.Trying to force people to share data in a data mesh fashion didn't work for Leboncoin and probably won't in most organizations. 
Find curious developers and help them accomplish something with data - that will drive buy-in.
Related to the previous point, data products often start as something serving the producing domain and then evolve to serve additional use cases. They start by serving a specific business need and evolve from there.
Look to build your tooling to enforce your data governance requirements/needs. Trying to put too much on the plate of software engineers probably won't go well.
Around the time Zhamak's first post on data mesh came out in mid 2019, Leboncoin was experiencing many of the pain points Zhamak laid out quite clearly in her article. Their teams were already organized in the "Spotify model", so data ownership was already distributed to many of the domains. But they were seeing increasing time-to-market - often hitting what Simon called "very long" - for new data initiatives. They already had an organizational model and some ways of working that fit well with data mesh, so they decided to give it a try.
So, per S&S, they tried using the data mesh principles for a first use case - building out their recommendation engine. It was a greenfield initiative, so it was a good one to test out how well data mesh could work for incremental data needs.
In order to proceed with the pilot, S&S and the rest of the data team had to negotiate with the CTO. Once the pilot was successful, they started embedding data engineers into the teams with the most obvious needs while starting to build out the self-service platform. They already had their CI/CD platform for the operational side, so they adapted it to also work with data products. Then they added the additional data processing requirements, the governance, etc. to make it as self-service as possible for data producing teams.
The good news, per S&S, was immediate traction for the self-serve platform with the back-end engineers. But they were still suffering from the distance between the data and software engineering people/capabilities. It was difficult to get the software engineers to see data engineering as a type of software engineering, and many of the data engineers also had a hard time seeing data engineering as a subset of software engineering.
This is a common complaint from many organizations - just because you embed data engineers into domains, that doesn't mean everything becomes easy. You still need to get the software engineers/developers to understand and care about data and data engineering practices, and the data engineers need to learn more about software engineering to best collaborate with the software engineers.
Data pipelines were a major blind spot for a number of the software engineers, according to S&S. If the software engineers were doing pipelines at all, most were not doing them that well - with a number of not-so-great practices, to put it nicely. So there was a focus on communicating why data pipelines are so crucial to the overall company and how software engineers can learn to do them better. Data mesh can help to facilitate sharing that vision, and giving the software engineers ownership over data got them excited in many cases.
S&S are reevaluating whether their current internal guild setup really works with a data mesh approach. It is currently organized only by specialty, which means there isn't a lot of cross-pollination of information - people outside a specific guild don't have easy access to the new best practices shared with members of that guild. Tim Tischler mentioned the idea of broad group show-and-tells / info sessions around data products; done around data practices, that may help with these challenges.
This lack of broader best practice sharing is biting S&S and Leboncoin in the behind, especially around testing. While software engineers know how to write really good software tests, most data engineers aren't as good at writing tests, and software engineers aren't good in general at writing data-specific tests. But testing is really crucial to being confident in future changes - if you don't know what will happen with a change, that's a bad spot to be in.
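The conversation doesn't spell out what good data-specific tests look like, so here is a minimal, hypothetical example of the kind of test being pointed at - assertions about the data itself (nulls, uniqueness, business-rule ranges) rather than about code paths. The dataset, column names, and thresholds are invented; tools like dbt tests or Great Expectations formalize the same idea.

# Minimal, hypothetical data-specific tests (pytest style). The dataset,
# column names, and thresholds are invented purely for illustration.
import pandas as pd

def load_ad_events() -> pd.DataFrame:
    # Stand-in for reading a data product's output (warehouse table, files, ...).
    return pd.read_parquet("ad_events.parquet")

def test_no_null_primary_keys():
    df = load_ad_events()
    assert df["event_id"].notna().all(), "event_id must never be null"

def test_event_ids_are_unique():
    df = load_ad_events()
    assert not df["event_id"].duplicated().any(), "duplicate event_id found"

def test_prices_within_expected_range():
    df = load_ad_events()
    # Example business rule: listing prices are positive and below a sanity cap.
    assert ((df["price_eur"] > 0) & (df["price_eur"] < 1_000_000)).all()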
On driving buy-in, S&S shared that trying to force people along the path to sharing their data just didn't work well. What they found worked was finding the curious developers and helping them accomplish what they wanted with data - and finding actual projects that can add value, ones with specific use cases, often ones that are directly useful to that domain itself first.
At Leboncoin, many of their data products start off serving the producing domain, and then the domain lets others know they've created potentially useful data. This is similar to what Leboncoin does on the microservices side, with teams often consuming their own events from the enterprise service bus. So the first step for a data product is to build to explicit business needs and then see if additional business value comes from the data product.
Per S&S, another thing that has been helpful is their roadmap process - teams should tell other teams what they will need from them early. If you have a need for data, you need to communicate it early so other teams can prioritize it. There isn't an expectation of immediately producing data, which is a healthy way to collaborate.
Leboncoin has an interesting approach to sharing information. On the operational plane, as mentioned earlier, they have an enterprise service bus and teams are expected to share information that might not be explicitly useful to themselves - they are asked to consider what might be useful for other teams and to share that from the start of the development process, so there is no need to request it be added later. They are taking the same approach on the data side with data mesh. It might not be in data products with strong SLAs, but other domains can at least understand what data could be formed into data products.
S&S recommend that when you start building out your federated governance, really start by following the pain. Put data engineers and back-end engineers in the same room to find out what actually needs to be done and what should be built into the platform. If you can make the tooling enforce governance requirements/needs, that's easier for pretty much all parties.
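The episode doesn't describe Leboncoin's actual tooling, but "make the tooling enforce governance" often takes the shape of an automated check that runs before a data product can be published (for example in CI). A hypothetical sketch - the required fields, allowed values, and rules below are all invented for illustration:

# Hypothetical sketch of governance enforced by tooling: an automated check
# that runs (e.g. in CI) before a data product may be published. The required
# fields, allowed values, and rules are invented for illustration.
REQUIRED_FIELDS = {"name", "owner_team", "data_classification", "retention_days"}
ALLOWED_CLASSIFICATIONS = {"public", "internal", "restricted"}

def validate_data_product(descriptor: dict) -> list[str]:
    # Returns a list of violations; an empty list means the product may ship.
    errors = []
    missing = REQUIRED_FIELDS - descriptor.keys()
    if missing:
        errors.append(f"missing required fields: {sorted(missing)}")
    if descriptor.get("data_classification") not in ALLOWED_CLASSIFICATIONS:
        errors.append(f"data_classification must be one of {sorted(ALLOWED_CLASSIFICATIONS)}")
    if descriptor.get("contains_pii") and descriptor.get("retention_days", 0) > 365:
        errors.append("PII data may not be retained for more than 365 days")
    return errors

violations = validate_data_product({
    "name": "search_queries_daily",
    "owner_team": "search",
    "data_classification": "internal",
    "retention_days": 400,
    "contains_pii": True,
})
if violations:
    raise SystemExit("governance check failed:\n" + "\n".join(violations))

The design point is that the check lives in shared tooling, so individual software engineers don't have to remember every governance rule themselves.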
S&S finished the conversation with a few quick quotes: "bet on curious people", "just have people talk to each other", and "lower the cognitive costs of using the tooling" - if you can do that, you'll raise your chance of success with your data mesh implementation.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 17, 2022 • 27min
Weekly Episode Summaries and Programming Notes - Week of July 17, 2022 - Data Mesh Radio
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 15, 2022 • 1h 19min
#101 H&M's Data Mesh Journey So Far Including Finding Reusability in Interesting Places - Interview w/ Erik Herou
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.
Erik's LinkedIn: https://www.linkedin.com/in/erikherou/
H&M Career page: https://career.hm.com/
In this episode, Scott interviewed Erik Herou, Lead Engineer of the Data Platform at H&M. To be clear, Erik was only representing his own views and perspectives.
A few key thoughts/takeaways from Erik's point of view:
Data mesh can work well with a product-centric organization strategy, as both look to put ownership and product thinking in the hands of the domains.
To develop a good data/enablement platform for data mesh, look to work with a number of different types of teams. That way, you can see the persistent/reusable patterns and capabilities and find ways to reduce friction for future data product development/deployment.
H&M had an existing cloud data lake that was/is working relatively well for existing use cases. But the team knew it likely wouldn't be able to handle where they wanted to go, with many more teams producing data products of much higher quality - and potentially sophistication.
When implementing data mesh - or any data initiative really - it is easy to fall into the trap of doing things the same way you did before. The "old way" feels safe and it was/is still working relatively well for H&M. So they treated their data mesh implementation as almost a greenfield deploy.
Because of the long-term focus on making it low friction and scalable to share data - the consumers will come as you make them more data literate - most of the early data/enablement platform work has been focused on helping data producers. This is a common pattern in data mesh, but your constraints and needs may not match.
Erik's team is focused on enabling data producers first specifically so his team doesn't become a bottleneck. It is easy for a platform team doing any part of the individual work to become that bottleneck.
Consider how much organizational change you require before starting to create mesh data products. H&M did a large amount of that organizational change upfront; other companies start in their current structure and evolve as they learn more. Both are valid and can work well.
Specific to H&M, a strong track record of good return on investment in AI meant there was less pushback than in many organizations when they started driving buy-in for implementing data mesh.
In the historical data warehouse world, there was less need for data literacy because most people were pushed reports but also couldn't do much, thus not "getting themselves in trouble". If we move to a more self-serve approach, we need much better data literacy - it can be a big risk to allow access without understanding. Otherwise, it could be like turning a six year old loose in a fully stocked kitchen where they intend to "make dinner".
Data catalogs could really help push forward general data practices, but we still need to have actual conversations too. Being able to ask someone what data means - and similar high-context exchanges - is crucial.
"If you have a complicated business, you have complicated data."
If your mesh data products don't maintain loose coupling, your data mesh implementation is probably headed for troubled territory. It's one of the key tenets of Zhamak's concept of a data product/quantum: to be architecturally independent.
Input ports are an easily overlooked place to find reuse. Many teams need the same type or style of processing from similar source systems. Having standard input ports can significantly help reduce the complications around building data product ingest mechanisms.
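To make that input port takeaway a bit more concrete, here is an invented illustration (not H&M's actual platform spec) of what standardized, declarative input and output ports might look like: the platform solves an ingest pattern like "consume a Kafka topic against an agreed schema" once, and many data products simply declare that they use it.

# Invented illustration of standardized input ports (not H&M's actual platform
# spec): the platform implements a small set of ingest patterns once, and data
# products declare which pattern they use instead of hand-building ingestion.
from dataclasses import dataclass

@dataclass(frozen=True)
class InputPort:
    port_type: str    # e.g. "kafka_topic", "sftp_drop", "cdc_stream"
    source: str
    schema_ref: str   # pointer to the agreed schema/contract

@dataclass(frozen=True)
class OutputPort:
    port_type: str    # e.g. "parquet_table", "rest_api"
    location: str

@dataclass
class DataProduct:
    name: str
    owner_domain: str
    input_ports: list[InputPort]
    output_ports: list[OutputPort]

# Two data products from different domains reusing the same "kafka_topic"
# input port style - the reuse described in the episode shows up at this layer.
store_sales = DataProduct(
    name="store_sales_daily",
    owner_domain="sales",
    input_ports=[InputPort("kafka_topic", "pos.transactions.v1", "schemas/pos_txn.avsc")],
    output_ports=[OutputPort("parquet_table", "lake/sales/store_sales_daily")],
)
online_returns = DataProduct(
    name="online_returns",
    owner_domain="customer_care",
    input_ports=[InputPort("kafka_topic", "ecom.returns.v1", "schemas/return.avsc")],
    output_ports=[OutputPort("parquet_table", "lake/care/online_returns")],
)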
About 3 years ago, when Zhamak's first data mesh article was published (May 20, 2019), H&M was reorganizing to be a product-centric organization, and data mesh dovetailed nicely with that strategy - they were moving away from IT as a service-oriented organization. Erik and team knew that with a move to a product-centric approach, teams would need to be - and would become - data savvy and "data intense". With their existing setup and knowledge, many teams would not be able to meet the new requirements; while H&M's early AI investments were paying off, many teams just weren't ready for that complicated of data work. To scale their org-wide data capabilities, they would need something like data mesh, because the teams doing AI were the very mature teams - maturing ~200 teams to that level would be essentially impossible. Especially when you think about getting to self-serve data producing and consuming as necessary to scale ways of working.
The management team at H&M was bought in to the product-centric reorganization, so overall it was not too difficult to drive buy-in for implementing data mesh at the same time, per Erik. There was buy-in and interest in participating from all types of teams, from pure data producer to pure consumer and everywhere in between. There were a number of teams with the capabilities and resources to participate.
As part of the platform/core enablement team for data mesh, Erik saw how helpful it was to work with multiple types of teams serving different needs. Because they worked on multiple pilots across a range of teams with differing capabilities as well as needs, they were better able to identify reusable parts of the data product development/deployment/management process to add to what the platform team offered.
Erik and team had a leg up on many other organizations considering data mesh: a data platform that was working well and serving current customer demands. Erik called the data consumers "happy enough" with their existing cloud data lake, as they could mostly do what they needed to do. But the data team also knew that their existing cloud data lake would not scale to what they needed in the mid- to long-term, as it would likely not be able to handle ~200 teams all producing data products. A key benefit of this existing well-functioning solution was that there wasn't a rush to get a replacement in place.
H&M's approach to building out their enablement/data platform was almost a greenfield approach, per Erik. He said it is easy to fall into patterns similar to what you've done in the past, especially since their existing solution was already working. But they knew they had to stay away from the gravity of what they'd done before and look for new ways of working. Again, they had the time to do it right and to think about the initial stages as a bridging solution, not a rip and replace. And thus far, it is working well.
To date, the main focus of H&M's data/enablement platform team has been building the self-service capabilities for the data producers. There is already a large pool of highly data literate data consumers, especially the previously mentioned teams that are advanced in applying AI. So these initial stages are about testing so the team can discover ways to make it easy for data producers to create and manage data products. Most of the initial data products are source-aligned, generic data products not tailored to any specific use case.
The mid-term data/enablement platform strategy focuses on iteration and learning patterns. Erik and team know they won't get it all right upfront, so making sure people understand there will be iteration and evolution is key to keeping people bought in to the long-term, big-picture vision. That's where they plan to really focus on making the platform as easy as possible for consumers as well.
Erik shared the big reason for focusing on building the enabling capabilities into the platform rather than the data processing or other capabilities. First, they already have a good platform that can do the data processing :D But also, by not taking on any of the work themselves and by finding ways to reduce friction, they can avoid becoming a bottleneck and make it easier for more teams to participate. It is easy to get dragged into doing specific work.
Per Erik, as many guests have said, data mesh is very much an organizationally-focused effort. The technology and architecture sides aren't easy, but to have a successful implementation, more effort will need to be spent on the organizational aspects. H&M was inspired by what Spotify has done with their organizational approach, leading to their previously mentioned product-centric thinking/approach. One interesting point: Erik believes you need to implement at least a decent amount of your organizational change at the start of your journey, or teams will struggle to deliver mesh data products.
So why did H&M not have much pushback - why were so many teams, including data producers, bought in to participating in the data mesh implementation? Per Erik, H&M has had a good track record of driving strong returns on their early investments in AI, especially around driving business optimizations. But overall, people understood that the current AI setup would not scale to a wider audience. So they've seen strong returns from doing data well and trust the data leadership to deliver further.
Erik made the interesting point that in the data warehouse world, most data consumers were plenty data literate relative to their needs - but that was because they were fed reports directly, with no real push to be inquisitive. Everything was also controlled, so there was a good data quality filter. Once you open up to self-serve consumption, that can cause issues.
The big issues Erik has seen with allowing self-serve access without proper training / data literacy efforts are mostly around data misuse. Not unethical or inappropriate use, but simply misunderstanding what the data means and which data to use to answer important questions. But he hopes that their data mesh implementation will guide people to the right information, especially by providing the right contacts to get more information. Per Erik, many, many people in data are putting a lot of hope in where data catalogs are headed.
But data catalogs should not be the only way people learn about what data is available or what that available data actually means. Conversations about data are valuable - and they can be fun! A good example Erik gave: if people are asking a lot of unexpected or possibly strange questions about your data product, it might be a signal you should re-engineer it.
Erik and Scott agreed that part of where data mesh approaches things so differently is the emphasis on loose coupling between data products. Coupling in data has historically made it extremely difficult to make changes, so we need to prevent that - BUT still make data interoperable. Otherwise it's just high quality data silos. Not every data product needs to interoperate with every other data product, though. And there also needs to be different types of data serving based on consumer needs, so data products will need multiple output ports.
In wrapping up, Erik shared the specific types of patterns and practices the data/enablement platform team is working on: schemas and schema handling generally, sensible defaults, input ports, etc. The input port example was really interesting and enlightening - Scott hadn't heard that example in 80+ interviews.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf