
Data Mesh Radio

Latest episodes

Apr 24, 2023 • 1h 11min

#216 What is Your Part in Doing the Right Thing in Data: Value, Ethics, Literacy, and More - Interview w/ Guy Taylor

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts (most interviews from #32 on) here. Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.

Guy's LinkedIn: https://www.linkedin.com/in/guytaylor/
Deep Work by Cal Newport: https://www.youtube.com/watch?v=xJYlhhT7hyE

In this episode, Scott interviewed Guy Taylor, Director of Data Science and Analytics and Director of Experimentation at Booking.com. To be clear, Guy was only representing his own views on the episode.

Some key takeaways/thoughts from Guy's point of view:

- In general, "people want to do the right thing." Look to reward people who do the good, ethical things as part of their work product.
- Always be asking "what is my part in this?" Ask for the expectations others have of you and try to be clear in your expectations of others.
- Literacy is not only the ability to read but also to write. So with data literacy/fluency, we need to be able to use data but also to create and share it. It's about learning how to share information - not just 1s and 0s - and share it well.
- There are still major communication gaps between producers and consumers in many cases with data. Part of that is just not getting on the same page; both sides really need to close that gap. Scott note: as Andrew Pease said, both parties should go more than halfway to ensure you've covered everything.
- If you don't align on expectations, you're far more likely to have a bad time :)
- Data people need to stop jumping to the tooling first to address a challenge. Get the necessary information - what are people trying to accomplish to create value? - then look at how tools can drive towards capturing that value.
- In tech, and especially in data, there is a strong tendency towards taking action now. Sometimes no action, or no action right now, is the right answer. Oftentimes you need more information to make a better decision instead of patching the hole in the bottom of the boat with bread.
- ?Controversial?: The way many organizations are trying to leverage data contracts won't lead to significantly better outcomes. There should be technology handling the interface, but consumers still need to speak with producers to align on expectations and share their use cases.
- ?Controversial?: There needs to be more accountability and responsibility on data consumers to actively participate in data work to drive towards the most business value.
- Ethics in data is always a challenging but interesting issue. Start by creating a set of principles and try to frame potential choices within those principles. Your company ethics will evolve; update your practices as they do. Try to "do the right thing" and encourage others to check in on past decisions to reevaluate too.
- It's crucial to consider cognitive load - people are most productive when they have time to spend on thinking. Work to give teams, especially those new to owning data, the time to learn, not just the information they should learn.
- When people know there are expectations on them but don't know what those expectations are, that's unnecessary cognitive load. Look to make expectations clearer for domains about what data ownership means.
- On data ownership, be very clear about how far it extends. Data producers should own sharing the information, but the consumer has responsibilities of ownership too. Scott note: for me, you need to be clear in every relationship, but it can be the producer owning the data, the insight, or the insight and the 'so what' - you just have to be very clear!
- In most organizations, teams are overloaded with work and with cognitive load. You can't just easily pull apart the complexity and fix that overnight, and in a lot of cases it will be hard to do at all. Look to prioritize instead of doing everything. Saying no can be your ultimate productivity tool.
- Be like Marie Kondo - 'does this spark value for your organization?' Don't be afraid to shut things down that don't spark value, and refurbish the orphaned data-generating processes that are valuable.
- Time for innovation is crucial. It's on leaders to enable their teams to prioritize innovation and experimentation.
- Always be delivering value. Look to use incremental value delivery methods - how do you break down a big potential project and deliver over time, instead of taking the high-risk path of one big delivery?

Guy started off by talking about data literacy and how the analogy of literacy - it's not only the ability to read but also to write - carries over to data well. Data literacy, or data fluency, isn't just whether someone can consume data but whether they can also produce data - can they share information in a way that can be 'read' by others? After all, we aren't trying to share data, we are trying to share information via data.

When Guy embeds people from his data team into domains, it "is with the express purpose of doing education, making sure that we are having the conversations around what things mean, what our expectations of those things are." Instead of embedding people into domains to do most of the work, they are focused on helping other people get to a level where they can handle far more of the necessary data work - which is quite often not the deep data work but bridging the communication gaps and getting on the same page. That is especially important for expectations: mismatched expectations are one of the most prevalent and damaging challenges in data work. So Guy asks data team members to spend a lot of time making sure the producers know how to manage those conversations and drive to what is actually of value instead of what was initially requested. In his experience, there is a tendency for data people to jump to the tooling to solve issues. Going back to expectations: if you try to solve a problem without the expectation-setting and leveling conversation, you will likely not deliver what consumers expect. You may see it as solved, but they sure don't. That's where you get the dreaded "the data is bad" feedback, because there aren't clear metrics and expectations. If you align - and, as Ghada Richani mentioned in her episode (#206), stay aligned through collaborative prioritization - then there is a much better chance of delivering value and making all parties happy.

A comment Guy made was that there is an over-tendency towards action in tech and especially in data. People see a problem and want to jump to fixing it instead of getting the necessary information first. And sometimes no action is the best answer. Just because there is pain doesn't mean action is necessary immediately.

Right now, Guy sees the industry conversations around data contracts and data sharing agreements as slightly naïve: people seem to be thinking this is about data integration between systems instead of data sharing between two parties, and that producers should declare every aspect of what they are producing instead of consumers being part of the conversation. Consumers need to share what they are trying to achieve, how they will use the data, etc. so producers understand the value and what would disrupt that value creation. There needs to be accountability and responsibility falling on consumers too. The contract portion can serve as the technology interface, but that doesn't replace the need for conversation (a sketch of what that interface portion might look like follows below).

Ethics in data is always going to be an interesting but challenging problem in Guy's book. A good place to start is the social contract aspect: how would this be viewed by society? As an organization, start down the ethics path by creating and agreeing to a set of principles. Create good ways for people to seek and receive useful feedback regarding ethics. And honestly, your company ethics will change, and it's important to reevaluate your ethical choices, especially as you learn more - your organization will have made mistakes, and that's typically not something to lose sleep over; fix it now and know you're better for it. Basically, look to "do the right thing."

While it can feel good to 'make progress', Guy believes in the Deep Work (Cal Newport) philosophy. People have the greatest impact when they have the time to really think and process. Yet in today's work world, that is a rarity for anyone. If we are asking teams to really take on data ownership, we have to work to prioritize the time to learn - and that includes processing time. Yes, people learn by doing, but not only by doing :)

Guy talked about clearing the space for teams to learn something new, including the impact on the cognitive load capacity of teams, especially when it comes to data ownership in the domains. When people know there are expectations of them and their work but those expectations aren't explicit or clear, that's unnecessary cognitive load. Domains need crisp and clear expectations - and if the expectations from consumers and the data team aren't super clear yet, communicate that. But try to get to more detail to make the implicit very explicit.

Another common friction point Guy pointed to is the lack of understanding of your impact on others. He believes, again, that most people want to "do the right thing", but they don't always know there is even a problem to address. How are your actions impacting downstream data consumers? Again, there is a responsibility on those data consumers to generate the conversation! Sharing that context allows people to do the right thing because they are aware.

Currently, most teams in most organizations seem to be overloaded with work and cognitively overloaded, in Guy's view. It would be lovely to wave a magic wand and fix that, but it's not possible. So we have to work to pull out the complexity and give our teams the ability to do their best work - but high-value work has interconnected complexity, so you can't take a machete to it and expect good results.
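To make the "technology interface" point concrete, here is a minimal sketch of what the machine-checkable portion of a data contract might look like. All names, fields, and the validate helper are illustrative assumptions, not anything from the episode. Note what the code can and cannot cover: schema and freshness checks automate nicely, but the consumer's intended use is only documentation that still has to be aligned on in conversation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Hypothetical machine-checkable half of a data contract. The human half -
# what the consumer is actually trying to achieve with the data - lives
# outside this object and requires a producer/consumer conversation.
@dataclass
class DataContract:
    dataset: str
    owner: str                      # producing team accountable for the data
    schema: dict[str, type]         # column name -> expected Python type
    max_staleness: timedelta        # freshness expectation
    consumer_use_cases: list[str] = field(default_factory=list)  # documented, not enforced

    def validate(self, rows: list[dict], produced_at: datetime) -> list[str]:
        """Return a list of violations; an empty list means the interface holds."""
        violations = []
        if datetime.now(timezone.utc) - produced_at > self.max_staleness:
            violations.append(f"{self.dataset}: data is staler than {self.max_staleness}")
        for i, row in enumerate(rows):
            for col, col_type in self.schema.items():
                if col not in row:
                    violations.append(f"row {i}: missing column '{col}'")
                elif not isinstance(row[col], col_type):
                    violations.append(f"row {i}: '{col}' is not {col_type.__name__}")
        return violations

contract = DataContract(
    dataset="bookings_daily",
    owner="reservations-domain",
    schema={"booking_id": str, "amount": float},
    max_staleness=timedelta(hours=24),
    consumer_use_cases=["weekly revenue forecast"],  # shared via conversation
)
print(contract.validate(
    [{"booking_id": "b-123", "amount": 99.5}],
    produced_at=datetime.now(timezone.utc),
))  # -> [] when the interface expectations are met
```

The design point, matching Guy's argument: the contract object catches interface drift automatically, while the consumer_use_cases field exists only so producers know whom a breaking change would hurt - that knowledge still comes from talking.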
Look to prioritize what really matters when you can and break things down into manageable chunks.

In talking again about data ownership, Guy believes that producing teams should not have to own too much of the downstream consumption. They should own the transformation and the sharing of data after it's been transformed, but the consumer should own the insight or metric - they asked for the data, so they need to have some responsibilities and accountability too. Scott note: I don't entirely agree it should be that way for every use case, but I do like a very crisp line of ownership if it works.

Guy uses the Marie Kondo approach when looking at orphaned systems or processes: evaluate whether it is "sparking joy" - is it creating value? If it is, great; get it into good shape and put it into the right hands ownership-wise. Don't shove something onto someone while it is still in data disrepair - get it functioning well and then hand it over. But if something isn't creating value, shut it down. For too long in data, there has been a hesitancy to shut things down because at some point they might create value. Don't fall into that trap.

It's important to get to an experimentation approach - test, learn, and then iterate - in Guy's view; he is the Director of Experimentation after all! A good way to get people to see the value of experimentation is how it provides far better incremental value delivery. Instead of huge projects with big budgets that take years and rarely deliver the value expected, what if you took the overall goals, broke them into manageable pieces, and delivered value over time as you get closer and closer to the project vision? You'll be more nimble and get a return on investment far quicker.

A few tidbits from the end of the conversation:

- Skunkworks can be a great approach to trying things out and seeing if there is value. Don't try to move the skunkworks directly to production, but you can do some fun and useful innovation that way.
- Good leaders set their teams up to innovate. They prioritize the time to try new things and let their people explore, let them go off the "paved road".

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Apr 23, 2023 • 18min

Weekly Episode Summaries and Programming Notes – Week of April 23, 2023

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Apr 21, 2023 • 52min

#215 Panel: Leading a Data Mesh Implementation - Led by Kim Thies w/ Omar Khawaja, Ferd Scheepers, and Mike Alvarez

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts (most interviews from #32 on) here. Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.

Kim's LinkedIn: https://www.linkedin.com/in/vtkthies/
Mike's LinkedIn: https://www.linkedin.com/in/2mikealvarez/
Ferd's LinkedIn: https://www.linkedin.com/in/ferdscheepers/
Omar's LinkedIn: https://www.linkedin.com/in/kmaomar/

In this episode, guest host Kim Thies, Director of Intelligence Automation at PayPal, facilitated a discussion with Ferd Scheepers, Chief Information Architect at ING, Mike Alvarez, former VP of Digital Services at a large healthcare distribution company (guest of episode #236), and Omar Khawaja, Head of Business Intelligence at Roche (guest of episode #96). As per usual, all guests were only reflecting their own views.

Scott note: I wanted to share my takeaways rather than trying to reflect the nuance of the panelists' views individually.

Before we jump in, I think the main takeaway is that a data mesh implementation leader's journey can be a lonely one. Find peers and exchange information. You can reach out to me (Scott), but there are also many leaders who want to exchange information with each other. The other is the meaning of "journey": it's never done. Be prepared to continue to push - it can feel Sisyphean, but it's important to keep moving forward and to expect to keep driving buy-in.

Scott's Top Takeaways:

- Everyone sees the 'Instagram photos' version of other organizations' data mesh journeys - it's not the reality. Everyone is struggling with certain aspects of data mesh, because if this were easy, people would read Zhamak's book and be done with it. It's just not realistic to expect that; give your leaders (yourself?) a break :)
- It's incredibly important to understand you will get things 'wrong'. Scratch that, you will get MANY things wrong. But 'wrong' in data doesn't have to mean wrong for good/all time. It's about trying, learning, and iterating to better. Really look into fast-fail practices. If you feel the need to get everything right, data mesh is definitely not right for you right now.
- A champion early adopter is crucial to drive a data mesh implementation. There needs to be someone who will partner with you that is "brave enough to try new things", as Omar said.
- It's easy to focus on so many aspects of data mesh and forget that the mindset shift and cultural change are the biggest challenges for most organizations. They are the most squishy/least tangible, but that doesn't mean they aren't absolutely crucial.
- Teams need the autonomy and empowerment to move at the right speed for them. The more friction in their data work, the more time - and effort - it takes to deliver value, and by then the world may have moved on. You have to give teams the capability to strike while the iron's hot. That's a good way to drive buy-in: we're going to give you the ability to move at the speed of business.
- Be very clear on expectations. What do we owe each other? Focus on how data mesh drives value for them but also how it drives value for the organization. What is the output of their work; what value comes from what they did or will do? It will be quite difficult to change this kind of thinking overnight.
- Similarly, be very clear on responsibilities and very clear on target outcomes. It's easy to get lost in the work instead of in what we are trying to achieve.
- A good success story doesn't have to be a massive win. Time to delivery is really valuable, so showing you can get something to MVP in a few weeks will get a number of teams excited. Driving that lower is of significant value - and it's a tangible value for many business partners, who can reasonably share the impact on their business.

Other Important Takeaways (many touch on similar points from different angles):

- There's a very good reason why we refer to them as data mesh journeys :) You can't treat this as a project. Think of your physical fitness and health: it's an ongoing journey with plenty of trials and tribulations.
- It's okay if data mesh is not a fit for your organization. There's no shame in that. Be honest when considering whether it's a fit or whether you just want to do it because it sounds like a good solution in the abstract.
- In general, look to collect reasons for past failures prior to doing data mesh. Not to throw others or past ways under the bus, but to emphasize that the current ways aren't working as well as we might like. Proof points really help convince others.
- It's always a challenge to maintain autonomy and empowerment without it leading to silos. There isn't a magic formula. It's something you will have to constantly keep a watch on - but you're not the only one.
- There is a balance between autonomy and interoperability. You'll probably get it 'not great' at first, and THAT'S OKAY. Work to find the right balance - or at least an acceptable balance - as you move forward. And that balance will likely shift.
- The focus of the platform should be on making users' lives simpler so they can actually own the data. It's easy to fall into focusing on the tech, but the value is in friction reduction for users and in the abstractions you offer them.
- Be prepared to need to continually drive buy-in. Gravity around historical - legacy? - data management practices is STRONG. You will have people wanting to use an enterprise data warehouse or a data lake setup, or not wanting to focus on integration with the rest of the organization. Announcing you are doing data mesh doesn't magically mean everyone is bought in.
- Similarly, be prepared to explain concepts and terms repeatedly. Data mesh - and data work in general - is not the main focus of most people in your organization. Do your best to have the patience to work with people to drive towards a better understanding.
- Don't focus overly on terminology - see my unicorn farts theory… - focus on what the change in work means for people and what outcomes you are trying to achieve. Business leaders and users don't care that you are doing data mesh; data mesh is a shared language for the data people working on the implementation. Especially don't try to explicitly define federated computational governance :D
- As Ferd said, look to "bring governance back to the essence of what we want to do." Don't get overly focused on governing everything instead of the right things.
- That can mean not taking on really difficult challenges right at the start of your journey by passing on some use cases. That's okay; think about your capabilities and honestly assess them. Don't bite off more than you can chew.
- Incremental progress is the best approach to most (all?) aspects of data mesh, rather than trying to take giant leaps.
- Early in your journey, focus more on quick wins that will give you some good internal marketing material. Market those successes to build desire for additional domains to join - build the momentum of success. Also, think about limiting chances for failure early. Fast fail in general is crucial, but you don't want a full 'fast fail' on your first use case.
- Many things will not work or go great in a data mesh journey. That's the reality of being on the bleeding edge, of being an explorer. It's very important to reflect not just on wins but to take the learnings from what didn't work. If you don't reflect on your mistakes, you're far more likely to repeat them.
- Your early champions and business partners can come from unexpected corners of the business. Look for bravery and a well-defined - and valuable - use case with limited initial scope. Be open to surprising initial use cases and to working with unexpected lines of business/domains.
- Try to directly tie business value generation to your data mesh work. Understand the value drivers for the organization and prioritize the work based on those value drivers.
- To get buy-in, you don't have to offer domains the perfect solution, just one that is a better option than whatever else is available internally for their use case.
- Another good potential hook for potential business partners is that data mesh - specifically, handling your data as a product - carries less technical debt, so it can quickly pay for itself in saved time, especially for use cases that are mandatory, like regulatory compliance.
- Be very diligent in putting teams together, especially early. Throwing bodies at the problem won't solve anything, but not having the right capabilities in the teams sets those teams up for failure. I recommend reading Zhamak's book on this topic re necessary capabilities in teams.
- Incentivization to become a data producer can be quite hard, but it's extremely important to get right. How can we think of value creation through data without the perception of monetization? Because that can feel like you are selling data externally.
- For your first use case, you need to have the business and technology/data teams working as one team - one _actual_ team, not just a team in name. That doesn't mean a huge re-org, but you need people constantly sharing context and partnering to drive to value. Otherwise, you have a high risk of falling back into the same old ways of doing things.
- Make failure not a catastrophe. As Vanya Seth says, "contain your blast radius."
- It's crucial for people to understand that doing decentralized data means a lot of changes. Get people comfortable - or at least acutely aware - that we'll need new ways of working and delivering value.
- It's probably easier in most organizations to get funding/buy-in if you focus deeply on the business use case. Yes, building data mesh capabilities adds overhead to the initial use case or two as you spend time and effort learning how to do data mesh, and it's okay to talk about that. But if you aren't talking direct, tangible business value, getting buy-in - and, more importantly, funding - will likely be harder.
- Make sure you have buy-in from consumers on use cases. If they aren't bought in, will they even consume? If you don't have users, the best product in the world is useless.
- Budgetary issues - why would one domain create value for another if they aren't getting compensated for it? - are extremely common. There isn't an easy answer here, unfortunately.

As stated above, in general, it's often lonely being the person leading a data mesh journey. Who do you talk to? There's me (Scott), but find other leaders and network.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Apr 17, 2023 • 1h 3min

#214 Is Core Data the Way to Achieve MDM's Goals in Data Mesh - Interview w/ Marcie Stoetzel

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts (most interviews from #32 on) here. Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.

Marcie's LinkedIn: https://www.linkedin.com/in/marcie-stoetzel/
DGIQ Conference (Marcie and Samia Rahman will be speaking in June; in San Diego, CA, USA): https://dgiq2023west.dataversity.net/registration-welcome.cfm
Square hole video Scott mentioned: https://www.youtube.com/watch?v=cvwN_O5ypTE
Article about the "leaf sheep" sea slug Marcie mentioned: https://www.bbc.com/travel/article/20210324-the-odd-sea-creature-powered-by-the-sun

In this episode, Scott interviewed Marcie Stoetzel, Principal Product Manager of Enterprise Core Data at Seagen. To be clear, she was only representing her own views on the episode.

Before we jump in, Seagen might be a bit of a special case in how much the domains can leverage each other's data around a key record type. There is a lot to learn, but you might not be able to find use cases that are as broadly impactful to many domains at once.

Some key takeaways/thoughts from Marcie's point of view:

- ?Controversial?: In data mesh, building a data culture focused on engagement, learning, and upskilling might be as important (more?) as doing the data work - if teams aren't willing to engage, what's the point of doing the work? Scott note: Marcie doesn't explicitly say this, but it sure feels like an undercurrent. It's crucial to make your data culture something that can embrace data mesh.
- When going to domains, come with a target value proposition. Why would they get value from participating in your data mesh initiative? Scott note: if there isn't a value prop for the domain, you will almost certainly struggle with incentivization. There are very few 'good Samaritans' willing to do a lot of data work with no discernible direct benefit to their domain.
- Seeing is believing. Don't just tell people what you are going to do; get demos in front of them and show them more of what you plan to do.
- Make sure to solicit feedback as you build. It's far easier to pivot after a short period of work than after a longer one. You can find misaligned expectations far earlier and work on appropriate prioritization. Basically, you spend less time building things that don't matter and more time on things that do.
- Look for ways for domains to share what information they have that might be useful to the organization. Oftentimes, domains operate so independently that they aren't in the habit of finding ways to collaborate. Scott note: communication is the easiest path to high-value, cross-domain use cases - the mesh won't find them for us.
- To actually get to cross-domain use cases, you need domains to consider things from another domain's point of view. So extract more information about each domain to literally show to other domains to spark conversations. Be someone who connects the dots for domains and then connects people across domains.
- Once there is more visibility into cross-domain use cases, business goals can be realigned to focus on building towards common goals. That can help with some incentivization and alignment. Scott note: I hadn't previously heard this one articulated, but it really does map to the goal realignment with which many organizations are seeing success in data mesh.
- When you are early in your journey, look to create a "thin slice of an end-to-end experience for [a] domain." Really don't try to do too much. Scott note: this is what pretty much everyone who has worked on at least two data mesh journeys circles back to. It's VERY important to understand the thin slice concept.
- Interoperability requires a certain level of trust. You can decide by your use case how strong that trust has to be. If you need it to be essentially perfect for regulatory reasons, that's far different from the needs of many other use cases.
- Focus, especially early in your journey, is very useful. Again, thin slice. Don't take on too much while you build initial momentum and buy-in.
- ?Controversial?: Find ways to humanize your data work; add a bit of levity and humor. It will make people connect to it more and internalize it more. Plus, it's just more fun.
- ?Controversial?: Similarly, "explore, discover and mature data together." It's okay to be vulnerable, especially as we learn. Transparency is also crucial.
- Community, and things like workshops, are crucial to seeing wider adoption of and engagement in a data mesh implementation.
- Consider whether you want to use the phrase master data management (MDM) at all. "Master" has slavery connotations, and MDM feels far too heavy as a phrase. It can scare off the exact people you want leaning in to dealing with Enterprise Core Data.

Marcie started off with a bit about her background as a teacher, as well as on the commercial side of the healthcare space, which has shaped her view of teaching/learning and also of healthcare data needs. Part of Seagen's goal in using Enterprise Core Data instead of Master Data Management (MDM) as a phrase was moving away from the connotations with slavery (master) and signaling that this data is core to the enterprise - crucial to the organization, not just a data management practice or task. Enterprise Core Data at Seagen is about creating a way to make the data that many domains leverage the same, to prevent lots of domains doing the same work and to make interoperability FAR easier. MDM is also typically managed centrally instead of enabled centrally and managed at the domain level (federated governance), so you have to rethink a lot to do Enterprise Core Data in a data mesh setup. Trying to map 1:1 to MDM in a historical data approach won't work well.

At Seagen, Marcie and team decided to tackle one type of core data record first - healthcare professionals - rather than trying to unify every type of record across healthcare. Don't boil the ocean or bite off more than you can chew. They are working on creating the platform for domains to manage their core data records, which creates more sharing opportunities and even higher quality data - teams can better cross-reference information.
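As an aside, here is a minimal sketch of why a shared core record cuts duplicated cleansing and matching work. It is not from the episode: the field names, normalization rules, and matching logic are invented assumptions; real entity resolution on healthcare-professional data is far more involved.

```python
# Hypothetical sketch: without a shared core record, every domain normalizes
# and matches healthcare-professional (HCP) records itself; with one, domains
# resolve once against a shared key. All names here are illustrative.
def match_key(record: dict) -> tuple:
    """Derive a crude match key from a raw HCP record."""
    return (
        record["last_name"].strip().lower(),
        record["first_name"].strip().lower()[:1],        # first initial only
        record.get("license_number", "").replace("-", ""),
    )

core: dict[tuple, str] = {}  # match key -> canonical core data record id

def resolve(record: dict, domain: str) -> str:
    """Return the shared core record id, creating one on first sight."""
    key = match_key(record)
    if key not in core:
        core[key] = f"hcp-{len(core) + 1:06d}"
    print(f"{domain}: {record['last_name'].strip()} -> {core[key]}")
    return core[key]

# Two domains holding slightly different copies of the same professional:
resolve({"last_name": "Rivera ", "first_name": "Ana", "license_number": "MD-1234"}, "medical-affairs")
resolve({"last_name": "rivera", "first_name": "A.", "license_number": "MD1234"}, "commercial")
# Both calls resolve to the same core id, so neither domain redoes the
# cleanup and downstream consumers inherit the already-matched record.
```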
While they are still pre-production - it's early days - even bringing this to domains' attention is sparking conversations between domains about potential collaboration and new use cases.

Marcie and team are winning converts by showing them what the platform will be able to do for their own domain, while also keeping an eye on the interoperability and leverage provided to other domains. That means domains get some value from participating even if no other domains participate; if other domains do participate, then everyone gets more value from each other. They started with a simple value proposition - this will make handling your own data easier - and then created a group collaboration incentive - the more domains that participate, the better the information and the less work everyone has to do to get to better outcomes :)

When asked about incentivization complications around domains wanting to focus on their own goals, Marcie mentioned that as the domains start to find cross-domain use cases, the organization can realign goals to focus on those common goals where everyone wins. What drives the most value for the business, and how do we incent that kind of outcome/behavior? Scott note: this is an interesting nuance that I haven't heard before.

How they found the cross-domain use cases was also interesting. Marcie and team met independently with different business domains to extract what each team felt could be a good output of working with a common enterprise core data platform, as well as tying these individual value props back to the larger goal of helping more patients. The data team then literally took all those outputs and showed each business domain that uses HCP (healthcare professional) data the value across Seagen of having an enterprise core data platform. This sparked collaboration ideas between domains for how to drive even more value from the core data platform.

Marcie is seeing that domains have to spend a LOT of time cleansing and matching data. And downstream consumers of their data have to do the work too, as they don't know it's already been done upstream. The quality requirements for most use cases are pretty high, so creating a way for domains to much more easily interoperate data will save them a LOT of time and effort. The core data platform will hopefully prevent lots of domains from having to do much of that quality-checking work, and it will also increase quality by providing more sources of information to verify that data is high quality / correct - basically, checking quality becomes far less arduous.

One thing that is working well at Seagen for Marcie and team is doing lots of demos and small proofs of concept. Similar to sprint demos, teams are buying in because seeing is believing. They are showing the business the real, realizable value that will come from their participation, so teams are leaning in. Similar to what Karolina Henzel mentioned in episode #104, there can be a LOT of value in addressing data quality issues for domains.

Marcie talked about the value of maintaining focus on a thin slice. Instead of trying to get many domains bought in on the data mesh concept, there is a specific use case and they are only working with a few domains at the start. There are clearly defined and scoped benefits. Again, thin slice. And focusing on the end-to-end solution for working with this data for the domains has also helped to get and keep everyone on the same page.

Quick Tidbits:

- Look for ways to share knowledge in fun and interesting ways. Upskilling can be a bit intimidating; make it more gamified and less high pressure. Humanize (or in Kye's case, dog-ize) it a bit.
- Really embrace an attitude of learning - be vulnerable and transparent. "Explore, discover and mature data together." "Engaging conversations, exploration, curiosity, and a safe space" are crucial.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Apr 16, 2023 • 30min

Weekly Episode Summaries and Programming Notes – Week of April 16, 2023

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Apr 14, 2023 • 20min

#213 Zhamak's Corner 21 - Reinventing Data Development, Not Data Processing and Storage

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Sponsored by NextData, Zhamak's company that is helping ease data product creation.

For more great content from Zhamak, check out her book on data mesh, a book she collaborated on, her LinkedIn, and her Twitter.

Key Takeaways:

- We need to make the data product the first-class primitive of information sharing - the basic building block of how we create our internal data/AI ecosystem. (repeat from ZC20)
- "Tools reshape behavior" - if we want developers to change their relationship to data, we need to give them the capability to do so easily. Or, even if not easily, at least not arduously :)
- The industry sees the value of data mesh, but we need to find much lower-effort ways to create sustainable, large-scale change. We all need to be finding catalysts.
- There is no reason to try to reinvent a lot of the technology in data at the physical layer. Lake storage, streaming technologies, and ML libraries _at the physical layer_ are great. But we need new ways of accessing and leveraging them to make it far easier to create and manage data products.
- While everyone seems to be talking about data products, there seem to be many different definitions. This has led to a weaker market pull on vendors to improve their tooling to make data mesh more easily possible.

Semantic Diffusion article Zhamak mentioned: https://www.martinfowler.com/bliki/SemanticDiffusion.html

Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Data Mesh Radio episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
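To make the "first-class primitive" takeaway from this episode a bit more concrete, here is a hedged sketch of what treating the data product as the basic building block might look like in code. The fields, the catalog, and the publish helper are illustrative assumptions, not an API Zhamak describes.

```python
from dataclasses import dataclass

# A hypothetical "data product as first-class primitive": one object that
# bundles the output location, schema version, and guarantees, so developers
# build against data products rather than against raw storage or pipelines.
@dataclass(frozen=True)
class DataProduct:
    name: str
    owner_domain: str
    output_port: str           # where consumers read, e.g. a table or topic
    schema_version: str
    freshness_slo_hours: int   # guarantee consumers can build against

catalog: dict[str, DataProduct] = {}

def publish(product: DataProduct) -> None:
    """Register a data product as the unit of information sharing."""
    catalog[product.name] = product

publish(DataProduct(
    name="customer-orders",
    owner_domain="sales",
    output_port="s3://mesh/sales/customer-orders/v2",
    schema_version="2.1.0",
    freshness_slo_hours=24,
))

# Consumers discover and bind to the product, not to the underlying lake
# storage or streaming technology - the physical layer stays an
# implementation detail behind the primitive, per the episode's point.
print(catalog["customer-orders"].output_port)
```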
Apr 10, 2023 • 60min

#212 Reflections on Building a Data Mesh Platform from Scratch - Interview w/ Jyotshna Karki

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts (most interviews from #32 on) here. Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.

Jyotshna's LinkedIn: https://www.linkedin.com/in/jyotshna-karki-81a24038/

In this episode, Scott interviewed Jyotshna Karki, Data Engineer at Novo Nordisk. To be clear, she was only representing her own views on the episode.

Some key takeaways/thoughts from Jyotshna's point of view:

- In data, and especially in data engineering, people need to be curious. There are so many new innovations that may be majorly beneficial. Look to try out more approaches and technologies.
- You can have happy data producers and consumers with a centralized data lake setup and still have data mesh be the right evolution. In the long run, at scale, it isn't efficient to have a centralized data team coordinating all data use cases.
- For many domain teams, centralized data processing and storage can be a black box. Data goes in, it gets transformed and stored by the central team, and then it is served out. This can create a high dependency on experts and technology.
- ?Controversial?: If your domain teams consist of their own data engineers and data scientists, plus domain knowledge experts, to manage their own data products, it's okay to work with multiple teams at the start of a mesh journey. Scott note: if you don't need to drive buy-in and your org can do this, I don't see it as a major risk. But probably at most a few hundred (maybe even only tens of) organizations worldwide are like this.
- Don't try to enable every tool as part of your platform. Focus on creating a good experience around the most widely used tools rather than trying to support every tool available out there.
- ?Controversial?: Probably don't try to automate processes at the proof-of-concept stage. Wait until the need and impact are greater. But non-automated processes are typically tech debt you should look to pay down when it makes sense; don't ignore that, or it will hurt more at a later stage.
- Around best practices and things like reusable components and data pipeline blueprints, look to create centralized community sharing mechanisms but with decentralized ownership and contribution. Try to enable that sense of community knowledge sharing and trust.
- Create a process to assess whether you need to make a change to your platform. What are the business needs, and are you meeting them? Constantly look to evolve and improve your platform.
- KPIs for your platform are important - it's a product - but it's okay to start out pretty simple and use low-tech monitoring signals like the number of support tickets and customer feedback.
- Similarly, try to be more data-driven around building your platform, even if that data is pretty raw and unsophisticated to start. Scott note: try to stay away from vanity metrics, but it's okay if you start with vanity metrics until you understand what drives business value from the platform.
- Democratizing data, and especially doing data catalog well, has led to less data duplication. Because people know where to find data and can reliably get access again, they don't copy it or build something similar just to ensure they have the data they need.
- "…make sure that it is reliable enough for people to depend on this data." Trust is crucial; give people visibility into how data is handled so they have enough trust to really depend on it, not just use it.
- Look to community events like hackathons to drive additional experimentation and value. If you make innovation a part of your culture, good things will likely come.

Jyotshna started off the conversation with a bit about her background, especially in data engineering, and the need to be and stay curious. There are so many new approaches and technologies to consider that could provide significant benefit. Think with that product mindset and look to evolve your approaches and tech stack to create more value.

Specific to Novo Nordisk's data mesh journey, Jyotshna and team saw the writing on the wall for their data lake setup. While their centralized data lake was doing well and people were happy with it, consumer and producer demands were increasing, and the central data team was still required to help teams create their data products. Having a centralized team in the middle of every use case just wouldn't be efficient. Then they hit some cloud service limits, which caused some major headaches as well. All this led to looking to decentralize via data mesh.

At Novo Nordisk, many domains already had significant data capabilities, and there were people building data products anyway, according to Jyotshna. What they really needed was a way to empower and enable teams to more easily create and manage those data products in an interoperable way and to lower the bar. So the central data team was to focus on the platform, and there wasn't a huge need to upskill all the domains. There is still another centralized team of data experts to help domains that aren't as data fluent. Scott note: while this is not super uncommon, most organizations are not this lucky :D

Specific to the pharma industry, Jyotshna shared some of the pre-data-mesh compliance/regulatory issues that were better addressed with data mesh. Domains needed to work with regulators, but it was hard for them to see exactly how the data was stored, as it was managed by the central team - and that visibility is part of compliance. It was all part of the central data lake AWS account, and those teams didn't have the ownership or visibility they needed. With data mesh, the teams now have visibility into their own data storage and access to audit logs and data governance.

Jyotshna shared that at Novo Nordisk there was so much demand to participate in their data mesh that the data platform team - and the centralized data capabilities assisting the domains without high data fluency - worked with multiple teams to start. This helped them define the requirements for their data mesh platform to support multiple data domains. While this is a data mesh anti-pattern, it went well for them, as many of the domains were quite capable with data engineering and data analysis. There were also many domains that wanted to contribute aspects to the platform, so there were good feedback loops between the platform team and many domains. Scott note: don't go this route unless your domains are already highly data fluent/capable. Working with many domains at the start can create a high-risk scenario instead of thin slicing.

Jyotshna and team are focused on enabling proofs of concept more than on trying to automate everything right at the start. She noted they are focusing on understanding the problem deeply and moving fast to get proofs of concept into people's hands, then circling back to automate when there is more need and things are slightly more stable. Basically, they are being agile. It has also led to more modular components and reusability - they can get things out in a prototype phase and then think bigger picture about how to deal with similar problems, instead of building point solutions.

In order to prevent tight coupling and keep modularity, Jyotshna and team actually started to remove things like data pipeline blueprints, reusable components, and account bootstrapping from the data mesh platform. While that might feel counterintuitive, they wanted to create a community specifically around things like blueprints, so that the central team wasn't managing them - community members were. Look to create central sharing mechanisms but with a decentralized ownership and contribution model. Community-led innovation is more scalable than centralized knowledge ownership.

When thinking about platform maturity and whether they need to pay down any tech debt, especially around certain features, Jyotshna and team benchmark quality levels and compare those to the actual business needs. Being in a heavily regulated industry, some aspects of compliance are simply non-negotiable; you must meet them. But there are places where 'good enough for now' is a completely acceptable and correct answer. Some signals they use are support tickets and direct feedback around different aspects of the platform (a small sketch of that kind of signal follows below). They are also starting to build KPIs, but it's a work in progress.

One interesting aspect of doing data mesh has been less duplication of work, per Jyotshna. This is a target goal of data mesh, of course, but it came about naturally: now that people can reliably find and access data, they don't feel a need to build it themselves.

Jyotshna said, "make sure that it is reliable enough for people to depend on this data". Part of your platform and your overall mesh is to make it easy for consumers - but also producers - to trust the data. If you have a black-box process, can producers really trust it? And the evolution of your data products plays a part in trust too - a consumer can trust that the way data is presented is still relevant to the business.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
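Following up on the platform-signals point above: here is a minimal sketch of using support tickets per platform feature as a low-tech health signal before formal KPIs exist. The ticket data and feature names are invented for illustration; they are not from the episode.

```python
from collections import Counter

# Hypothetical low-tech platform health signal: count open support tickets
# per platform feature and read the hotspots as a prioritization signal.
tickets = [
    {"feature": "data-catalog", "severity": "low"},
    {"feature": "account-bootstrapping", "severity": "high"},
    {"feature": "account-bootstrapping", "severity": "medium"},
    {"feature": "pipeline-blueprints", "severity": "low"},
]

by_feature = Counter(t["feature"] for t in tickets)
for feature, count in by_feature.most_common():
    print(f"{feature}: {count} open ticket(s)")
# account-bootstrapping tops the list -> a candidate for the next round of
# automation / tech debt paydown, backed by a signal rather than a hunch.
```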
Apr 9, 2023 • 15min

Weekly Episode Summaries and Programming Notes – Week of April 9, 2023

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Apr 7, 2023 • 20min

#211 Another One on Buy-In: Flipping the Script on Working with Your First Domain Pt 1 - Mesh Musings 46

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts (most interviews from #32 on) here.

3 use case aspects or patterns to lean away from when trying to work with your initial domain:

- What I call the domain type 'everyone else in the organization wants your data, so give it to us'.
- Who is going to fund this? Not having a clear path to exec sponsorship and to the expected value is not great.
- How do I add this to my priorities? Essentially, 'great, but that's more work I can't take on.'

And the 4 you should look to lean into:

- Find an initial use case that is self-contained in the domain.
- Move at the speed of business.
- Find a use case that is easy enough, with enough return, to pay for itself as you also invest in the other aspects.
- We will be investing heavily in making your domain better with data.

Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Apr 3, 2023 • 60min

#210 Organizational Scalability in Data Mesh - Interview w/ Chris Haas

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts (most interviews from #32 on) hereProvided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.Chris' LinkedIn: https://www.linkedin.com/in/christopher-andr%C3%A9-haas/Article on Lean Value Tree: https://rolandbutler.medium.com/what-is-the-lean-value-tree-e90d06328f09In this episode, Scott interviewed Chris Haas, Advisory Consultant at Thoughtworks. To be clear, he was only representing his own views on the episode.Some key takeaways/thoughts from Chris' point of view:When starting a data mesh journey, you have to find teams that are actually willing to participate. If they won't truly take ownership of their data, it's essentially a non-starter. You can help convince many domains by showing them the investment you'll put into their teams on upskilling and other capabilities enhancements - it's not just new responsibilities and work.In data mesh, look to drive domain-wide understanding and buy-in, at least to the vision of what a more data capable domain means, the benefits to the domain. Not everyone will or has to care about data mesh specifically but you shouldn't have data mesh by decree from upper management.Consider lean value tree mapping - it helps everyone align on a single mission with goals to support the mission, what bets you have to make for each goal, and on down. It can help you stay focused on what are you trying to achieve.For use case prioritization within a domain, look to 1) what are the business goals, 2) what use case will have the highest value, and 3) what is the easiest to execute on. Don't get overly focused on value if you aren't ready to actually deliver it. Scott note: don't forget risk too?Controversial?: Choose a use case with a single domain if possible for your organization's first data mesh use case. It just limits the amount of challenges and potential failure modes.It's common to want to first focus on the technical aspects of data mesh - it's the tangible part and easier to see changes - but you should be focusing on the organizational aspects just as much if not more early in your journey.You can start your data mesh platform team in a central data team, that's a normal pattern. And it doesn't have to be huge, 4-5 people building and a product owner - the platform is a product too.A domain-based data product team doesn’t have to be large or require a big reorganization of the domain. Make sure they have the capabilities - data fluency and domain subject matter expertise - to execute but it can be a small team carved out of the domain.Data product teams tend to be long standing teams but they are typically responsible for developing multiple data products, not just a singular data product. Most data products don't need that level of day-to-day tending :)?Controversial?: You should consider your data product team a new team in the domain and a new team means budget. 
If you don't have budget and you don't rework priorities, you are putting additional work on a domain and the likelihood of success falls - look to avoid that risk.Doing decentralized data, doing data mesh doesn't mean everything is decentralized. Scott note: this is a persistent myth around data mesh. Not everything becomes decentralized or you will just create silos.?Controversial?: Once you've proved out you can do data mesh and drive value, when thinking about adding new domains, consider setting up a transformation office to coordinate, upskill, and prioritize work across domains.Look to build an internal community to encourage cross domain communication and hopefully collaboration.Finding cross-domain use cases is likely to be more difficult. Leverage that internal community to create more conversations that result in new, valuable use cases.When finding your data products for a specific use case, start from what would be the specific data product to support this use case. Then start to work backwards towards what source-aligned data products you need. And of course, look at existing data products for potential reuse first.!Controversial!: Source aligned data products should be - or at least start off - "very specific and very small."Chris started out with a rather blunt but crucial statement: when thinking about data mesh, you have to identify at least one domain that will actually take ownership of their data. A successful data mesh implementation can't be entirely IT driven. And domain data ownership and coupling that with data as a product are typically very much not the natural order for most large organizations. You will need to invest into domains to make them capable of owning their data - showing that you will invest in their success can help win them over - so you want to make sure it will be money/resources/time well spent.For Chris, it's very important for there to be at least a domain-wide understanding - and hopefully buy-in too - for what the domain is trying to do around data. That can be about data mesh or simply how they are changing their relationship to data. It won't work well to do data mesh from just upper management buy-in and pushing that down as a mandate. And that requirement can lead to deciding that data mesh isn't right for a domain and that is perfectly okay and reasonable - not every data management challenge is a nail for a data mesh hammer.Lean value trees are an important tool for Chris and team when speaking with a domain about data mesh. What are they actually trying to achieve - the mission - and work backwards from there. Break down the goals, then figure out the assumptions or bets around those goals. This helps you stay focused on what you are trying to achieve - is it deliver a data product or is it address the use case and create business value?So the output of a lean value tree helps align the team on a mission or missions which align to the business goals. Then, when thinking about use case prioritization, you need to balance those business goals, the amount of expected value of a use case, and the expected amount of work to deliver that use case*. * Scott note: it is kind of included in the amount of work and expected value but I'd also factor in risk - what is the risk of this use case not being valuable and what is the risk of the team not being able to execute well on the specific use case. 
At the start of a journey, Chris recommends finding a use case that benefits the producing domain if at all possible. Yes, we want domains to publish data to benefit the entire organization, but if a domain is going to be the test subject and invest their money, their people's time, their people's cognitive load, etc., it will be hard to find a domain willing to do that for another domain without decently strong incentivization. And at the start of your journey, those incentivization and community mechanisms are likely hard to come by. If you don't have these challenges and domains are all happy to help each other, consider yourself very lucky.

Based on interactions with a number of clients and prospects, many organizations would rather focus on the technical aspects of data mesh first over the organizational, and Chris wishes that weren't the case. While building out the technical aspects is no easy task, if you aren't ready to actually do the day-to-day work, what are you building the tech to support? Scott note: this is extremely common and is also a very common comment from consultants. The tech feels more tangible and it's easier to say yes/no to than squishy operating model discussions. But those discussions are crucial to doing data mesh right.

Chris believes an early data mesh alignment on the organizational model doesn't have to be - and shouldn't be - disruptive. You can have a domain start to carve out new ways of working without doing a major reorganization. There should be a team building the platform and a product owner for the platform, but 4-5 people building is reasonable, and they can live in a central data team - that's a normal pattern, no reorg needed. As for the data product team, you want them to be part of the domain if possible, and it should be a mix of highly data-fluent people and domain subject matter experts, but again, it can be a small team. So a small team is carved out, but the entire domain isn't realigned. Scott note: remember domain is an overloaded term. Some domains are sub-domains of 3-5 people, but typically we mean the line of business, which can be 10K+ people.

When asked about data product teams, Chris said you should look to have them be long-standing teams. You need an owner per data product, but often the development teams will be charged with creating multiple data products rather than only one. Especially as they build the knowledge of how to build good data products, those teams become more efficient. But do not treat your data products like projects; they should be managed/evolved by long-standing teams, otherwise they are more likely to fall into data disrepair.

As a consultant, Chris recommends working with a consultancy :D but the point is that your data product team in the domain should be considered a new team, and that new team needs a budget. It might be reallocated budget, but hopefully it's new budget. The consultancy angle is that if the implementation doesn't deliver value, it's easier to cut the consultancy than reassign or lay off employees. Scott note: I think this is a valid concern, but many can't get the budget and many organizations are doing this work entirely internally.
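Circling back to the "long-standing team maintaining multiple data products, each with its own owner" shape described above, here is a hypothetical sketch of the kind of lightweight ownership metadata that shape implies. The DataProduct fields, team names, and owners are all invented for the example - this is not any real platform's metadata model.

from dataclasses import dataclass

# Hypothetical sketch: one long-standing domain team maintains several data
# products, each with its own named owner. Invented names throughout.

@dataclass(frozen=True)
class DataProduct:
    name: str
    domain: str
    owner: str             # each data product needs an owner...
    maintaining_team: str  # ...but one long-standing team can maintain several

subscriptions_products = [
    DataProduct("active-subscriptions", "subscriptions", "a.person", "subscriptions-data"),
    DataProduct("churn-events", "subscriptions", "b.person", "subscriptions-data"),
    DataProduct("billing-history", "subscriptions", "c.person", "subscriptions-data"),
]

# One small team (say, 4-5 people carved out of the domain) tends all three;
# no single data product needs a dedicated full-time team of its own.
assert {p.maintaining_team for p in subscriptions_products} == {"subscriptions-data"}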
Once you've proven value from, and a capability to do, data mesh, a typical pattern according to Chris is to set up a transformation office that will assist other domains in their journey. That way you can maintain coordination, find reusable patterns - tech, architecture, people, process, etc. - prioritize, upskill, and so on. That office is helpful in getting teams up to speed but also in setting expectations - domains won't magically transform their data capabilities overnight. And look to build an internal community function to encourage cross-domain communication and hopefully collaboration.

For Chris, that community aspect is really crucial to identifying cross-domain use cases. The data product owners should be communicating with each other to discuss what they've created in case that sparks new ideas for new use cases.

For finding what data products you need to support a use case, start from the mission, then the consumer-aligned data product that would best serve that mission. Once you know what you'd want to have as the end product, you can start to find the necessary source-aligned data products - see the sketch after the wrap-up list below.

Chris wrapped up with a few useful tidbits:

- Don't worry about trying to serve future, unknown use cases with data products you are building now. Build for reuse and build for evolution, then evolve when those new use cases emerge.
- Source-aligned data products should be - or at least start out - "very specific and very small." Don't try to cram too much in. Scott note: I think this can create some discoverability issues, but it is a pattern I am seeing more. See Carlos Saona's episode (#150) for how that looks in an implementation.
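Here is a small, hypothetical sketch of that working-backwards step: given the source-aligned inputs a consumer-aligned data product would need, check an existing catalog for reuse first and only plan to build what's missing. All product names and the plan_inputs helper are invented for illustration; they are not from the episode.

# Hypothetical sketch of "working backwards" from a consumer-aligned data
# product to the source-aligned data products it needs, checking an existing
# catalog for reuse first. The catalog and product names are invented.

existing_catalog = {"crm-contacts", "web-clickstream"}  # already-built data products

def plan_inputs(required_inputs: set[str], catalog: set[str]) -> tuple[set[str], set[str]]:
    """Split required source-aligned inputs into reusable vs. to-be-built."""
    reuse = required_inputs & catalog
    build = required_inputs - catalog
    return reuse, build

# Consumer-aligned product for a churn use case, and the source-aligned
# inputs it would need:
required = {"crm-contacts", "subscription-events", "support-tickets"}

reuse, build = plan_inputs(required, existing_catalog)
print("Reuse existing:", sorted(reuse))   # ['crm-contacts']
print("Build new (keep them very specific and very small):", sorted(build))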
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf