
Data Mesh Radio
Interviews with data mesh practitioners, deep dives/how-tos, anti-patterns, panels, chats (not debates) with skeptics, "mesh musings", and so much more. Host Scott Hirleman (founder of the Data Mesh Learning Community) shares his learnings - and those of the broader data community - from over a year of deep diving into data mesh.
Each episode contains a BLUF - bottom line, up front - so you can quickly absorb a few key takeaways and also decide if an episode will be useful to you - nothing worse than listening for 20+ minutes before figuring out if a podcast episode is going to be interesting and/or incremental ;) Hoping to provide quality transcripts in the future - if you want to help, please reach out!
Data Mesh Radio is also looking for guests to share their experience with data mesh! Even if that experience is 'I am confused, let's chat about some specific topic'. Yes, that could be you! You can check out our guest and feedback FAQ - including how to submit your name to be a guest and how to submit feedback (anonymously if you want) - here: https://docs.google.com/document/d/1dDdb1mEhmcYqx3xYAvPuM1FZMuGiCszyY9x8X250KuQ/edit?usp=sharing
Data Mesh Radio is committed to diversity and inclusion, including in our guests and guest hosts. If you are part of a minoritized group, please see this as an open invitation to be a guest - just hit the link above.
If you are looking for additional useful information on data mesh, we recommend the community resources from Data Mesh Learning. All are vendor independent. https://datameshlearning.com/community/
You should also follow Zhamak Dehghani (founder of the data mesh concept); she posts a lot of great things on LinkedIn and has a wonderful data mesh book through O'Reilly. Plus, she's just a nice person: https://www.linkedin.com/in/zhamak-dehghani/detail/recent-activity/shares/
Data Mesh Radio is provided as a free community resource by DataStax. If you need a database that is easy to scale - read: serverless - but also easy to develop for - many APIs including gRPC, REST, JSON, and GraphQL, all of which are OSS under the Stargate project - check out DataStax's AstraDB service :) Built on Apache Cassandra, AstraDB is very performant and, oh yeah, is also multi-region/multi-cloud so you can focus on scaling your company, not your database. There's a free-forever tier for poking around/home projects, and you can also use code DAAP500 for a $500 free credit (apply under payment options): https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio
Latest episodes

Apr 2, 2023 • 20min
Weekly Episode Summaries and Programming Notes – Week of April 2, 2023
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Mar 31, 2023 • 60min
#209 Panel: What is BI's Place in Data Mesh - Led by Ammara Gafoor w/ Elif Tutuk and Ryan Dolley
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Ammara's LinkedIn: https://www.linkedin.com/in/ammara-gafoor/
Ammara's articles on data mesh:
Part 1: https://www.thoughtworks.com/insights/articles/data-mesh-in-practice-getting-off-to-the-right-start
Part 2: https://www.thoughtworks.com/insights/articles/data-mesh-in-practice-organizational-operating-model
Part 3: https://www.thoughtworks.com/insights/articles/data-mesh-in-practice-product-thinking-and-development
Elif's LinkedIn: https://www.linkedin.com/in/elift/
AtScale article on data mesh - Principles of data mesh and how the semantic layer brings it to life: https://www.atscale.com/resource/data-mesh-principles-semantic-layer/
General AtScale article - The Semantic Layer's Critical Roles in Modern Data Architectures: https://www.atscale.com/resource/the-semantic-layers-critical-roles-in-modern-data-architectures/
Ryan's LinkedIn: https://www.linkedin.com/in/ryandolley/
Super Data Bros Blog: https://superdatablog.substack.com/
Super Data Bros YouTube: https://www.youtube.com/@superdatabros

In this episode, guest host Ammara Gafoor, Principal Business Analyst at Thoughtworks (guest of episode #133), facilitated a discussion with Elif Tutuk, Global Head of Product at AtScale, and Ryan Dolley, Independent Data Consultant and one of the Super Data Bros (guest of episode #183). The focus topic was the role of business intelligence (BI) in data mesh - e.g. where it sits and who owns it - and how to enable BI to drive significant value in a data mesh implementation.

A few other episodes that would give a broader picture on related topics, in addition to Ammara's and Ryan's episodes, are #199 with Brent Dykes and #192 with João Sousa.

Scott note: I wanted to share my takeaways rather than trying to reflect the nuance of the panelists' views. This will be the standard for panels going forward.

Scott's Top Takeaways:
- It's really important to actually define what business intelligence even means to your organization. If everyone doesn't have a clearly defined picture, it's one of the easiest places to get a major mismatch of expectations - everyone thinks they know what we mean by BI, but we often mean different, sometimes vastly different, things.
- In data mesh, it's okay to have multiple org groups/layers of BI across the org - say, BI teams embedded in the domains plus a central BI team - as long as responsibilities are relatively clear and teams _communicate_. If everyone is working on similar or overlapping goals, it will create BI sprawl - e.g. dashboard and report sprawl with weak ownership.
- BI must focus on enabling business users, not just data analysts. How do we make it so regular business users can actually drive their own analysis and then own the output? How can they then share their insights back to others?
- It's important to focus on enabling BI capabilities over tooling. In BI that feels especially hard because the interface to the data for most people is literally the tools themselves, so of course the tools feel most important. How we make that distinction, especially to users, is hard and seems to be to-be-determined.
- Really consider whether BI is only a consumer or, more likely, also a crucial producer to the mesh. Are data analysis outputs going to be made available in something like an analytics catalog? Whether you call that part of the mesh or not, how are people reliably sharing the insights they create, and how are others discovering them?
- Ammara brought up the complexity of how we go from providing raw data in fundamental data products to actually doing BI - the tools are not designed to ingest from many sources, and BI professionals are typically not super technical. Do we create domain data marts or mesh aggregated data products? I like the recipe angle Mahmoud Yassin mentioned in episode #103. Data virtualization technologies also probably play a key role here.
- Elif made a good point that there is a difference between making data analytics-ready and business-ready. How do we bridge that gap? How do we even define that gap so we can recognize it before addressing it?
- Design thinking and product thinking are truly crucial to doing BI right, data mesh or not. For too long, all aspects of data - but especially BI - have used outputs as the metric, kind of like number of widgets produced. It can feel like accomplishing something even when it's not creating value. We need to focus much more on how people will use what we build to drive specific value. How do we create "delightful experiences", as Zhamak always says, for consumers of business intelligence?

Other Important Takeaways (many touch on similar points from different angles):
- BI is about "translating the things that happen in the business … into the format that's going to enable people to make informed decisions on it…" as Ryan said. It's easy to get bogged down in tools and techniques, but at the end of the day we're trying to provide intelligence that enables people to understand, make better decisions, and take reasonable actions.
- Data mesh gives us a chance to reinvent how we produce BI dashboards and reports. Instead of creating single tables to support each dashboard, we can easily source reliable data without creating rigid, quickly deteriorating, or abandoned data assets.
- Balancing user experience - especially letting users use their tools of choice - against reliability/scalability is very hard in BI. How do you track lineage into Excel? How do you _govern_ data that's been put into Excel? But if you try to force everyone into Power BI, will you get user disengagement? Scott note: you won't even be able to pry Excel from my dead hands. It's not for reporting, but it's the best data poking tool around for small to moderate data sets.
- We need to start thinking about how our mesh data products interface with business users. At the mesh experience plane level, how can they have what Zhamak calls in her book "a delightful experience"? In data generally, we often overlook the rank and file users to focus on the highly technical ones - those are more 'our people' - but we will miss out on a ton of business value if we do that with data mesh.
- Is a centralized BI team a mesh anti-pattern? I agree with Ammara and her colleague Emily Gorcenski: nope! You need someone really focused on generating and owning cross-domain use cases, or we are preventing data silos for nothing. If we don't actually combine data across the domains, what's the point of data mesh?! Just make sure to share information with other BI teams to prevent knowledge silos.
- There is a lot of focus on who sits where and exactly how things interact, which is important in some respects. But it's more important to focus on what we are actually trying to accomplish and then who owns that. Clear responsibilities win, the org chart doesn't! :)
- Governance is a key enabler in BI. Not just defining who does what, but creating common, easy-to-leverage interfaces to our data. That can't be done individually for each data product - it would be a ton of work for data producers, and consumers would have an awful time. So you have to consider how far the mesh experience plane extends. Does it reach into BI or not? If not, how do we achieve scale when business consumers have a disjointed experience and can't easily share with each other?
- Mesh consumer-aligned data product owners need to consider whether BI users are a target consumer. What output ports are you creating to make it easy for BI users to leverage your data product? (A rough sketch of what that might look like follows after this list.)
- How do we think about the semantic and metrics layers relative to BI in data mesh? BI users want to be able to trust data without deep-diving into transformations and lineage. How do we make that simple and easy? I really don't have good answers here.
- Zhamak has talked about her disdain for layers specifically for data in a data product and for the large-scale data pipeline approach. I don't believe this extends to the concepts of a semantic layer or a metrics layer if you treat those as something data products produce their metadata into. It's another aspect powering data discoverability. It's kind of a complicated analogy for how this might work and I am working on the wording…
- Governance has to focus on enabling BI usage of data more than restricting it. Of course, controlling who has access and what people can do with data is crucial, but with mesh, if we are so focused on producing great quality data across the org, we have to think about how we enable usage at scale. That's table stakes for data mesh.
- We need to create good, low-friction ways for people to go from generating an insight to sharing that insight. Insights often get locked in at the BI layer. How do we share those insights back to others on the mesh to leverage? Can we easily push insights into the mesh to create a new data product? This is a cultural AND a technology issue that needs to be addressed.
- In BI, there is often too much of a bias towards 'give me the complete picture and I'll go from there', which leads to complicated, ineffective customer 360 approaches - just sourcing a bunch of data without a specific business objective. Humans are curious creatures! Do we need to focus first on what information you are trying to understand and why, rather than 'produce all this data and I _might_ have an interesting insight'? I think yes.
- One thing that _feels_ like it is often missing from a BI strategy is bringing the business knowledge to the data instead of only bringing the data to business users. How to actually go about bringing the business knowledge to the data, I have no idea. Maybe it's the consumer-driven data modeling and feedback involved in data product creation, but it feels deeper than that. It's a concept that is brewing in my mind…
- Do we want some org-wide defined metrics? Taxonomies and ontologies can easily become overly rigid, but optional adherence can make things much easier when you think about standardized metrics and meanings. It's all a balance, and it feels like you should be constantly reassessing it, because it can easily go the route of too many metrics, with your mesh becoming littered with metrics that all mean slightly different things.
- BI people are likely to be skeptical of data mesh. Trying to get access to tons of systems and cross-correlate and consume data from multiple sources has historically been a pain - probably an insane pain, to be honest. Be ready for pushback even if this is a better solution in the long run - and likely the short run - for them. Show them how BI in data mesh can be better, so their focus stays on value.
- We need to get BI people on board with data mesh, what it offers, and how it will be better for them in the long run. BI people are often the data gateways to execs. If the BI people aren't on board, will your executives see the benefit of data mesh? Not just buy-in wise - will they literally see an improvement in their business operations from data work if the BI people aren't leveraging it?
- In general, we need to get better at translating the business need into the BI need and then into the data need. Data work is often not as valuable as expected because it gets divorced from what people actually care about. Far easier said than done, of course, but think about user experience across the entire data consumption and analysis / BI process.
- Churning out reports and dashboards is not a good end-consumer user experience. What do business users get out of it to drive their business? We need to tie more BI work directly to what would drive change, what would drive action if we found out new information. We can start by just looking at who is using reports/dashboards before getting more complicated.
- Product thinking in BI isn't just about the consumer experience; we need consumers to be part of the conversation, giving more feedback on the usage and importance of the BI work. Instead of just having information pushed at them, it needs to be a collaborative push/pull.
- Ammara asked about data mesh and BI teams: "…how does data mesh not add to the complexity? How do we not add more strain to the teams? … How do we not add more to their cognitive load?" How do we make BI feasible in data mesh? The answer is not as obvious as it feels initially. We want to get more value leverage, not more work, from the teams.
- Data virtualization might be a key aspect of doing BI well. Ghada Richani (episode #206) raved about what data virtualization has allowed them to do, because it's crucial to easily ingest data you are supposed to have access to - and many tools aren't designed to take data from lots of sources. I see lots of places where data virtualization is misused, but this one seems safe :)
- Some people are better served by feeding them reports and dashboards than by giving them advanced self-serve analytics capabilities. Do we need to level them up? Do we want to move past "first wave" type practices? I honestly don't know. Some people like to cling to the ways of the past, but those ways might also be serving them well. If it's scalable and reliable, do we need to push to 'modernize'? Probably? A bigger BI question than just data mesh :)
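To make the output-ports question above a bit more concrete, here is a minimal, hypothetical sketch (in Python) of a data product descriptor that declares audience-specific output ports. Every name and field is invented for illustration - data mesh doesn't prescribe a descriptor schema - but it shows the idea of a consumer-aligned data product explicitly declaring a port aimed at BI users.

```python
from dataclasses import dataclass, field

@dataclass
class OutputPort:
    name: str
    protocol: str   # e.g. "sql-view", "rest", "parquet-files" (illustrative values)
    audience: str   # e.g. "bi-analysts", "data-scientists"
    endpoint: str

@dataclass
class DataProductDescriptor:
    domain: str
    name: str
    owner: str
    output_ports: list[OutputPort] = field(default_factory=list)

# A hypothetical consumer-aligned data product that explicitly targets
# BI users alongside more technical consumers.
orders = DataProductDescriptor(
    domain="sales",
    name="orders",
    owner="sales-domain-team",
    output_ports=[
        OutputPort("curated_orders_view", "sql-view", "bi-analysts",
                   "warehouse.sales.curated_orders"),
        OutputPort("orders_events", "parquet-files", "data-scientists",
                   "s3://mesh/sales/orders/"),
    ],
)

# A BI tool (or the mesh experience plane) could filter ports by audience,
# so business users only see the interfaces meant for them.
bi_ports = [p for p in orders.output_ports if p.audience == "bi-analysts"]
print([p.endpoint for p in bi_ports])
```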
Data Mesh Radio is hosted by Scott Hirleman.
If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Mar 27, 2023 • 1h 14min
#208 Making Prioritization a Priority and Focusing on Delivering Value - Interview w/ Srinivas Paluri
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Srinivas' LinkedIn: https://www.linkedin.com/in/spaluri/

In this episode, Scott interviewed Srinivas Paluri, CTO at Rentbase. Srinivas was previously part of a data mesh implementation as the Senior Director of Data Engineering at Zillow.

Some key takeaways/thoughts from Srinivas' point of view:
- Data mesh advice to former self #1: Ambiguity is inevitable. Don't be afraid of ambiguity - it's often better than binary thinking - but also be as clear as possible on responsibilities even if the target outcome is ambiguous. Clear responsibilities at least drive things forward.
- Data mesh advice to former self #2: Involve product management way earlier. Every product owner needs to understand product ownership, prioritization, and what the value to the business is - and then communicate the value of the work to the business. If that value's not clear, why are you doing the work?
- Data mesh advice to former self #3: Create a small team, maybe 5-6 people, focused on enabling new domains to learn how to own their own data and create data products. Scott note: see episode 48 for how ITV is implementing this pattern.
- Prioritization - and communication around prioritization - is probably one of your most useful tools in data mesh. If you get good at it, teams will often buy in more quickly. Data producers see changing priorities, not more work. Consumers have a clear understanding of what work is done when and why, instead of silence or a link to a Confluence or JIRA page.
- Good data mesh product management isn't only focused at the data product level or the platform level. You need to look at the bigger picture of how everything fits together to drive even more value. Make sure you have thought coverage of mesh-level product management.
- Make sure everyone is aligned on how data engineering work supports business value and the organization's goals. The exec team should understand it, of course, but don't skimp on informing the data engineers - if they know more, it can help with better prioritization.
- Data ownership - and buy-in on that ownership - is crucial and can have immense value. You can drive that buy-in by showing people the consequences of centralized data ownership creating bottlenecks. What more value could have been created if things weren't stuck in the JIRA logjam?
- To communicate the cost of data issues: it's very hard to put dollar amounts on them, but you can show people the business impact and they can start to calculate it themselves - 'this report takes 5 people 3 days to clean up at the end of the quarter'. That has a big impact on business people, who can calculate costs far better than the data team.
- Work with teams to assess and document wins with data. You might not get to exact numbers, but you can show value with a little bit of partnering. Then share those wins and attract others that want some of that data-enabled value creation :)
- Look to business partners for setting prioritization around data work. See if you can get them to seek the budget to do the work with you. What are the company's priorities, and where should you be investing to support those? How does data work play into that?
- Set yourself up to learn and iterate when it comes to architecture. Prepare yourself to fail, and fail fast. It's way too hard to try to get everything right up front - you'll expend too much energy. Embrace that and get moving.
- Engineering teams need to be part of key business decisions. Not necessarily as a decision maker, but engineering finding out about changes only once systems break is a pattern that has to change. If you want to be data-informed and data-driven, you have to consider the impact on data as part of product decisions/changes.
- Make sure you support new data owners with technology and capability building. You need to make them capable of owning data, not just hand them the responsibility. And that won't be free.
- In data mesh, seek out teams that will give you blunt and honest feedback and are willing to work with you to improve your data mesh implementation. You want people who will push you to do better and share in the pain and the success.
- Focus on communicating what data mesh changes for your organization and what value it unleashes. Really talk with your central data team and explain what it means for them - it's not a threat to their jobs; they get to focus on cooler things than building pipelines.
- ?Controversial?: Don't look to rebuild your entire tech stack to do something like data mesh. See how far you can get with a lot of what you already have.

Srinivas started by saying how important architecture is, yes, but also how it's too hard to try to get everything right up front. Instead, set yourself up to be able to change your architecture as you learn more. Failing fast is crucial - you need to get moving sooner rather than later. Otherwise, there is too much risk of building something that doesn't fit needs or, even more likely, never building anything because you're always waiting for the perfect technologies.

To do engineering - especially data engineering - well, Srinivas believes engineering needs to be part of the decision-making process. If not providing input, then at least being aware of strategic shifts and working with key stakeholders to shift systems to better align to changes. You can't change your business model and expect it not to require a shift in what data you need and how you work with it! And you could have started collecting data for the business model shift sooner too :)

As data mesh aficionados know, centralized data engineering can create bottlenecks - and almost certainly will at scale. Srinivas recommends showing people the impact of these bottlenecks through specific examples. You can use that to drive better buy-in for decentralized, domain-based data ownership. It can be hard to quantify the exact impact, but it's important to at least communicate the outcome of those bottlenecks - how did they impact day-to-day business operations and capabilities?

When trying to push data ownership to domains, it's crucial to make sure you give them the support to actually own the data, per Srinivas. That's technology and that's capabilities/understanding. If you are saying data ownership has value and you can show that value, then the organization should support it - it's not free, but it's worth investing in.
Srinivas believes you should focus on making gradual changes rather than sudden shifts in a data mesh or other large-scale data implementation. There needs to be commitment to making change to do it right. As part of that, look to create and foster close collaboration with users. You need teams to be blunt and honest to help you get where you need to be - to a valuable outcome. That feedback will help you improve your processes and platforms.

If Srinivas could give 3 pieces of advice to his former self about data mesh:
#1 On ambiguity: it can be helpful, rather than binary right-or-wrong thinking. Get comfortable with ambiguity - the world is rarely black and white. But ambiguity should be about the ways of achieving something or the expected end result; set responsibilities as clearly as possible, because ambiguous ownership rarely works out well.
#2 Get product management involved earlier. They should be there from the start. Understanding product ownership is crucial to getting data mesh right. You need people who are focused on prioritization and on things that are valuable to the business. If it doesn't have a clear value, should you be doing it? And when you establish value, communicate it!
#3 Look to create a data mesh/domain enablement function - a team specifically for helping additional domains figure this out. It's not easy, and you will find great advocates who can really help teams get moving quickly. A team of 5-6 people would probably be a good size. Scott note: see episode 48 with Scott Hawkins for a great example of this 'internal consulting team in a box' approach.

Another thing Srinivas learned looking back on his time at Zillow is to focus on communicating to people what data mesh changes for the organization and especially for them. There is often a vague sense of data mesh changing the way we work, but what is the actual value we expect to drive? Not a specific dollar amount, but what's the organization's vision for once you reach a relatively data-informed, data-driven state? Faster reactions to market changes? Better identification of new opportunities? Significant cost savings? And again, talk to people about what changes for them, both value-wise and responsibility/role-wise.

If you want a somewhat visceral approach to showing people why the central data engineering team has become a bottleneck, Srinivas recommends asking how long it would take to clear your current backlog with your current team if no new tickets came in. And how big a team would you need to actually clear your backlog given the number of tickets coming in - is anyone's backlog actually decreasing? What could your organization be doing without that bottleneck? What more value could you be creating by really enabling teams to understand and leverage the organization's data? Yes, there will be a cost, but we have to invest to create value :)
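To make that backlog question tangible, here is a minimal back-of-the-envelope sketch in Python, with invented numbers: when tickets arrive faster than a central team can close them, the time to clear the backlog is effectively infinite - which is the bottleneck argument in one line of arithmetic.

```python
def weeks_to_clear_backlog(backlog: int, closed_per_week: float,
                           new_per_week: float) -> float:
    """Weeks to drain a ticket backlog given completion and arrival rates."""
    net_burn = closed_per_week - new_per_week
    if net_burn <= 0:
        return float("inf")  # the backlog never shrinks: a structural bottleneck
    return backlog / net_burn

# Invented example numbers for a central data engineering team.
print(weeks_to_clear_backlog(backlog=240, closed_per_week=20, new_per_week=25))
# inf -> the team can never catch up, regardless of effort
print(weeks_to_clear_backlog(backlog=240, closed_per_week=20, new_per_week=12))
# 30.0 -> 30 weeks to catch up, even if intake drops by roughly half
```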
When driving data mesh buy-in, Srinivas again went back to the need for product management in general: talking with producers about why this matters; how you are shifting their prioritization, not just adding additional work; how they can actually achieve this and what meaningful milestones / incremental value deliveries look like; etc. Prioritization - and communication around prioritization - is likely to lead to your easiest, happiest paths.

Data mesh product management should not only focus at the micro level - the data products and the platform - but also at the mesh level, according to Srinivas. To really drive value at scale, you need interoperability and things working in harmony. Product management is not just about managing a product but the entire suite of products. Think about your data products as an overall suite of information that serves many use cases and drives a huge amount of value.

For Srinivas, when setting your priorities for data work, start with the overall organization's priorities - what are the priorities of your business partners? If you do data work that isn't valuable to them, will it be valued even if it drives the expected value? Look to find data work that is both valuable and valued and that supports the overall organization's priorities. Scott note: this is especially key as you are getting going on a data mesh journey.

Everyone on the data engineering team should understand how their work supports the company's priorities and drives value, per Srinivas. And the exec team should understand how the data engineering work maps to value creation too. Sometimes that can be harder with platform work and the like, but it's important for everyone involved to understand how the data engineering work drives value. That helps map to prioritization decisions as well.

Data Mesh Radio is hosted by Scott Hirleman.
If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Mar 26, 2023 • 33min
Weekly Episode Summaries and Programming Notes – Week of March 26, 2023
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Mar 24, 2023 • 19min
#207 Zhamak's Corner 20 - Crossing the Data Value Chasm
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Sponsored by NextData, Zhamak's company that is helping ease data product creation.
For more great content from Zhamak, check out her book on data mesh, a book she collaborated on, her LinkedIn, and her Twitter.

Takeaways:
- People are seeing early value from data mesh, and we can see where it could be much greater, but we are still held back by organizational challenges and even more by the tooling.
- In many cases, the tooling isn't good enough yet to change developer behavior - if we remove friction, developers will likely want to lean in more on data mesh.
- We have to find catalysts at the micro level in data mesh to make a massive shift at the macro level. We can't change everything through pure force of organizational will. But we haven't found these catalysts yet.
- It's easy to get lost in the vastness of change in a data transformation around data mesh. Try to focus more at the micro level, with a goal of creating cascading reactions that drive the macro.
- We need to make mesh data products the first-class primitives of information sharing - the basic building block of how we create our internal data/AI ecosystem.

Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Data Mesh Radio episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Mar 20, 2023 • 1h 18min
#206 Learnings from Delivering and Then Measuring Value of Data Mesh Work - Interview w/ Ghada Richani
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Ghada's LinkedIn: https://www.linkedin.com/in/ghada-richani/

In this episode, Scott interviewed Ghada Richani, Managing Director, Data, Analytics, and Technology Innovation at Bank of America. To be clear, Ghada was representing only her own views on the episode, not those of the company.

Some key takeaways/thoughts from Ghada's point of view:
- "What I know today is going to change tomorrow." Data mesh is a journey; don't get too comfortable - we should always be trying and iterating. Be an explorer. Scott note: strong agree, and so does Zhamak.
- Speed is always a challenge with data mesh - some want to move too fast, while others want to boil the ocean to make everything that comes after extremely fast. Work with people to take them along the journey and make them part of the decisioning process; don't get ahead of yourself.
- Take your stakeholders lockstep along your journey and keep them very informed. Let them control prioritization. That way, they can see what changes are happening and why delivery timelines are extending - they made those calls!
- Expose the evolution of the data product to stakeholders. They can then understand tradeoffs, especially around regulatory or other governance challenges. And again, they own the prioritization :)
- Don't start with generic requirements; start with stakeholder deliverables. The requirements will emerge from that conversation - stakeholders don't necessarily know what is required technically or as a product structure, but they know for sure what they want to achieve from a business perspective.
- In data mesh, data producers should be treated as stakeholders as well. Make sure they are engaged and getting something from the process. That can be credit/visibility for value creation, true ownership instead of demands/requirements, additional insights about their own domain, lots of information about why this matters, etc.
- When driving buy-in and/or getting approval, you have to know your audience. That might seem obvious, but it's really not. What do they care about? Have you actually talked to them about what they care about? How can you win them over? How can you make it make sense to them and excite them?
- When trying to get approval for a big project, break it down into tangible pieces. If there are 20 things that could be improved around a high-value process, look at them discretely and deliver more and more of the 20 over time. You get a budget to fix one or a few, then prove out value and get a budget for more.
- If something is a priority for your team but not for another team you need to partner with, lean into that friction. Why is it a priority for your team and not theirs? Should it be pushed up the chain to align better? You can't make it a priority by pressing them; you need to work the right levers.
- ?Controversial?: One lever you can pull to drive data producer buy-in is the opportunity cost of not acting. There is a reason for a use case - something that will improve a process and drive value. If you ask the reluctant data producer to own the risk - the cost of not acting to improve that process - most will say no way and will participate.
- Measuring the value of data work is very hard and pretty imprecise. It must be a collaborative process with stakeholders - they are the ones who derive the direct value from the work. How much value is there in speeding up data product delivery by 40%? Your data/platform team can't know.
- It's crucial to build an environment where data failures fall on multiple stakeholders - and where a data product or other data work not meeting the expected value isn't necessarily a bad failure; your hypothesis may simply not have proven true. Limit the cost and size of those failures, but if you aren't failing, you probably aren't taking on enough risk.
- ?Controversial?: For high-profile, high-visibility projects/data product builds, checking in 2-3 times a week is normal and often helpful. You can identify and attempt to mitigate challenges and risks as they emerge.
- ?Controversial?: Create a highly visible accountability model for stakeholders and make sure they are aware of it. It's not about calling people out, but about holding them accountable - and if they aren't doing what's necessary, the executive sponsor should know and can address it.
- Virtualization at the query layer has allowed Ghada and team to mature the underlying data products over time while still presenting a mature, complete data product experience to users. Without virtualization, data consumers would not be nearly as happy with the data mesh implementation.
- A data virtualization layer has made interoperability easier as well. Connections between data products still need to be found, but then they can be codified and offered as a view for others to use.
- Discovering and establishing domain boundaries is crucial in data mesh. Sometimes "orphaned" data that isn't really owned by a data domain might live there temporarily, but you should always look to find/create a true owner if people are relying on that data.
- Your domain boundaries will change, and that's okay. Be ready and vigilant for signs that it's time for boundaries to change. There isn't a silver-bullet way to approach this, but with experience, necessary boundary changes will start to become more obvious.
- It's okay to have more than one data product per data domain, but make sure they are truly incremental to each other. The boundaries and the governance are more important than the number of data products in a domain.
- Every data product should have a specific purpose. Not serve only a single use case - we need data reusability - but don't add too much scope to a data product.

Ghada started by discussing balancing speed, structure, and control in your data mesh implementation. There are those who want to build everything upfront and boil the ocean, but also those who want to get to value as soon as possible without taking the product mindset to heart. Work with both sets of people to keep them deeply informed and show them why a balanced approach works better. If stakeholders are very close to the journey, they won't push back on timelines - they can see where prioritizations are changing and the learning happening as data products - or other aspects of your mesh - are being built. In fact, let them control prioritization where it makes sense, so they are the ones causing timelines to stretch and making the tradeoff decisions.

Keeping stakeholders closely informed also has benefits around control, in Ghada's experience. They can understand the tradeoffs relative to governance challenges like regulatory compliance. Exposing the actual evolution of the data product itself to stakeholders helps them feel comfortable with the process and confident that compliance/regulatory concerns are addressed.

For Ghada, starting from requirements for a use case doesn't work well - people aren't sure, and they get stuck in the details instead of the big picture. Instead, work with them to focus on what they are trying to achieve and what their deliverables are, then work backwards to figure out what they need to meet those deliverables. And those deliverables had better be tied to value somehow :)

While driving buy-in from data producers, Ghada recommends making them a clear stakeholder in the process. She's found that deeply informing them of how their data will be used and the value it will drive often gets them excited to participate. Of course, you need to work with them to prioritize the work, but showing them the value - or potential value - of a great use case often helps set that prioritization. You also want to highlight their work, either for them or, preferably, by setting the stage for them to present the value delivered from their data, giving them credit and visibility. When you do those things and give people true ownership, not just requirements, many data producers are far more willing to get involved.

In general, when trying to get approval for data work - and Ghada recognizes it can be hard - she has a few good approaches. One is to look at different aspects of what you are trying to improve. Say a process or product line drives significant value for the company: what could you do to tangibly improve the value it delivers? Not as one giant project; break it down into more tangible improvements and seek a budget to tackle one or a few, so you can prove out value and get a budget for additional improvements. Another is to know your audience. This might seem simple, but really, you have to learn what drives your counterparts and find a way to communicate the benefits in their own language, addressing something that matters to them. Make it digestible and hard to resist wanting to tackle the challenge. It's definitely more art than science.

One way Ghada has found to drive buy-in from reluctant data producers is to assign the cost of not doing something to them. Essentially, there is a benefit, a value, to doing the proposed work - whether increased revenue, decreased cost, decreased risk, increased speed, etc. So there is a negative to not doing the proposed work, and you ask the reluctant data producer to officially own that cost, that business risk. Many have become far less reluctant to participate :)
For Ghada, there are two ways in general to measure the value of data work: economic value and impact value. Economic value is slightly easier to conceive, if not that easy to measure - if you make improvements to a process or, say, create a new product line, you measure the incremental revenue it drove or the amount of cost savings. For impact value, the team(s) impacted by the changes have to give the value measurement - what is the value of speeding up a process, improving data quality, lowering the associated risk, etc.? Neither is an exact measurement, so it's crucial for stakeholders to understand that it's about triangulating and assessing value, not an exact amount of return. And the stakeholders, again, have to be the ones who assess value - only they can say what an impact would mean for them; the data team doesn't have the context to do that. And you need an organizational environment where the forecasts are seen as forecasts, not commitments.

It's okay to have failures in data work, according to Ghada. As many past guests have also noted, experimentation is about trying, learning, and iterating. Sometimes the learning is that this won't work or isn't worth the effort. Getting to that learning quickly and iterating to value - or stopping work when that's the right call - is crucial to driving significant value from data work. Your culture must allow for failure, or you just won't take on initiatives that are higher risk but higher reward, where the reward justifies the risk. You need to see getting to value and getting something directionally right as a win, so you can iterate towards more value.

In Ghada's experience, for high-profile, high-visibility, high-intensity projects/data product builds, it's not unusual to check in 2-3 times every week with all the stakeholders. While it may feel like overkill, you can find miscommunications or friction early and, even more importantly, identify and work to address challenges and risks as they emerge, e.g. if someone is disengaging. Instead of the data team going off and doing a bunch of work to deliver at the end of a months-long project, it's tight feedback loops, iteration, and changing priorities through close collaboration. And have a highly visible accountability model - if someone isn't delivering, that should escalate to the executive sponsor to figure out prioritization and an appropriate response.

On the platform side of things, Ghada is very happy with their use of data virtualization for their virtual query layer. As teams have learned how to build and mature their data products, data virtualization has meant they can expose what a mature data product looks like even when the underlying data product is not yet mature. The underlying data creation and curation process is not fully productized or robust in many instances, but consumers don't have to care. The views presented to users are controlled by subject matter experts and serve as a type of interface or output port, in a sense. More on data virtualization:
1) Sometimes the virtualization layer can lead to query performance challenges, but usually that's tied to someone trying to do too large a query all at once instead of breaking it down appropriately.
2) Data virtualization has made exposing connections between data products much easier - it's just creating another virtualized view. Connections need to be discovered/surfaced manually, but beyond that, it's quite easy to do interoperability if the data fits well together.
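As a rough, hypothetical illustration of that pattern - presenting a governed view that joins two data products rather than copying data or taking a hard dependency - here is a minimal Python sketch using DuckDB as a stand-in for a virtualization/federation engine. All table and column names are invented; the episode doesn't specify the actual stack beyond 'data virtualization at the query layer'.

```python
import duckdb

# Stand-in for a virtualization engine: two "data products" exist as
# independent relations, and consumers only ever query governed views.
con = duckdb.connect()

# Hypothetical upstream data product: customer master data.
con.execute("""
    CREATE TABLE customer_dp AS
    SELECT * FROM (VALUES
        (1, 'Acme Corp', 'EMEA'),
        (2, 'Globex', 'AMER')
    ) AS t(customer_id, name, region)
""")

# Hypothetical downstream-domain data product: orders.
con.execute("""
    CREATE TABLE orders_dp AS
    SELECT * FROM (VALUES
        (100, 1, 250.00),
        (101, 2, 125.50)
    ) AS t(order_id, customer_id, amount)
""")

# The interoperability "output port": a virtual view joining the two
# data products. No data is copied; the downstream team consumes the
# view instead of depending directly on the upstream tables, which can
# keep maturing or be swapped out behind the view.
con.execute("""
    CREATE VIEW orders_with_customer AS
    SELECT o.order_id, o.amount, c.name, c.region
    FROM orders_dp o
    JOIN customer_dp c USING (customer_id)
""")

print(con.execute("SELECT * FROM orders_with_customer").fetchall())
```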
Discovering and mapping domain boundaries is really crucial in data mesh, according to Ghada, and it will get easier as you go along. You really want to consider what you are trying to accomplish with a data product and not load many things into one data product, or it will become overloaded and hard to evolve/improve well. Data owned by a team that isn't the subject matter expert is a likely occurrence, but you should look to rectify it quickly. Teams building data products that consume information from upstream data products should not take unnecessary dependencies. At BofA, they create a virtual view that combines the upstream data from the source data product with the downstream data product, rather than the downstream data product taking a dependency. They also have many domains represented by a single data product, though some have more than one. The boundaries and the governance are far more important to get right than matching a certain number of data products to domains.

Every data product should have a defined purpose, in Ghada's view - that's how you find your data product boundaries. But a data product also shouldn't take on additional purposes; that's scope creep. That doesn't mean it can only serve a single use case - reusability is crucial - but when someone tries to find the right source for accomplishing a goal with data, it's best if they have to consider fewer options while still getting all the data they want/need. Yes, easier said than done :)

Data Mesh Radio is hosted by Scott Hirleman.
If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Mar 19, 2023 • 22min
Weekly Episode Summaries and Programming Notes – Week of March 19, 2023
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Mar 17, 2023 • 20min
#205 The Gartner Drama Files - Mesh Musings 45
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Mar 13, 2023 • 1h 8min
#204 Driving Towards Data Driven - How to Add More Data to Your Org's Decisions - Interview w/ Stephen Galsworthy
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Stephen's LinkedIn: https://www.linkedin.com/in/galsworthy/

In this episode, Scott interviewed Stephen Galsworthy, former Head of Data at TomTom. Obviously, given he's no longer with the company, he was representing only his own views.

Some key takeaways/thoughts from Stephen's point of view:
- Just because the products you sell are made from data doesn't mean you necessarily have a good process for leveraging data to make your products better. You need to embed information collection into how you create products and how customers interact with your offerings.
- The AI flywheel: if you can get good information on product user experience, you can feed that into AI systems that generate incremental insights to improve your user experience even more. Hopefully that generates more users, who generate even more information to improve even further.
- If data collection on usage is not part of your business model - for any number of reasons - it can be hard to convince customers and/or partners to enable that data collection, even if it's simply to improve the user experience. Try to add it into your product development as early as possible.
- If your organization is data hesitant, look for existing success stories from data - something that couldn't have happened without the data - and then share that success internally to drum up more interest.
- ?Controversial?: Data should rarely be THE deciding factor. Data should be a touch point that can strongly inform or give clarity. Use it for clarity or for measuring as you iterate. Help execs understand it's not magic and that it's not all right or wrong.
- It's easy to trust data when it confirms your intuition. How do you use it when it doesn't? How much credence should you give the data?
- ?Controversial?: High-performing companies tend to be those that use data to help make adjustments to their product strategy, using it as a tight feedback loop. It's not that the data decides, but it constantly informs and confirms (though not with absolute confirmation).
- Don't fall into the trap of collecting all the data you can. But do think about what you could do if you had additional data and what it might inform. Work ahead - you don't get past data the day you implement collection.
- There is a big difference between producing data and producing data as a product. If we don't incentivize and assist teams to produce data as a product, few will, and your data practices will remain fragile.
- Data producers need to be able to understand who has taken a dependency on their data, but right now it's generally too hard for them to do so. Better technology offerings here can help.
- A great - possibly the best - incentive for a team to produce their data as a product is what their data consumers can provide back to them in the form of insights. How can a team managing an app get more insight to improve their app by sharing their data? That's a cohesive org-wide data practice, instead of domains only focusing on themselves.
- 'In production' should not be seen as the end or the key goal - that's a project mindset. In production just means the start of a new phase for your data product.
- To get a team to data-driven, find a data ambassador within the team - and if there isn't one, deploy one into that team. Really encourage someone to take the lead on wrangling their data. Work to develop a passion for data in someone on the team that can spread to the rest of the team. Scott note: staying away from infectious analogies.
- The most data-driven teams have a mindset of always be experimenting. It's not about being right at the start; it's about constantly trying, learning, and getting better.

Stephen started off with a bit about his background working for companies that consolidate information into a product they sell, such as TomTom taking in tons of data to create maps. However, using data as a key ingredient of your products doesn't necessarily translate into collecting and analyzing information about how your products are actually used, which would allow you to drive improvements like new features or even new products. Organizations focused on obtaining and analyzing information on usage and other market dynamics are likely to be the ones that win in the market more often going forward - their user experience knowledge is going to be a much tighter feedback loop.

The AI flywheel is something Stephen mentioned that can create a virtuous cycle. You are creating data around interaction points with your products that feeds AI to make the products better. The better they are, the more people interact with them, generating more interaction points, meaning you can make your product even better. Essentially: collecting and analyzing data to make your product better -> you make your product better -> more people use the product -> more data to analyze to make your product better. However, if it's not inherently part of your business model, pivoting to that information-gathering practice can be a tough sell for customers and/or partners. Oftentimes, partners aren't even allowed to hand over data. Think about how you'll effectively collect data as early as possible, even if you don't start collecting it then.

A good way to build momentum around the importance of data to an organization is to start small, according to Stephen. It might sound great to try to convert the entire organization to being data driven at once, but Rome wasn't built in a day. Show some successes from working with data - find something that couldn't have been done without leveraging data, like a new product - then share those successes internally and encourage more parts of the organization to try leveraging data.

Stephen believes that data should rarely be THE deciding factor in decisions. The human in the loop is crucial. And it's crucial to make that clear to execs - you don't want them to think data is a magic wand / silver bullet, but they also shouldn't want the data making the decision. Data is a touch point in decisions; it's used to create better feedback loops as you iterate towards good solutions. Leveraging data well often isn't about making the one right call, it's about finding better and better ways.

It's very important to frame the role of data in your organization well - in Stephen's experience, it's what differentiates high-performing organizations in many cases. Again, it's not a silver bullet, but data lets you make smaller bets and quickly get to a better solution via tight test, learn, and iterate loops. You want to make data a companion to execs instead of an either/or with their intuition and experience. Data can help them arrive at the right decision.

Stephen laid out a data maturity journey for an organization or team. You need to start with collecting data - it seems obvious, but you need mechanisms to actually collect data, as noted earlier. Once you start collecting data, you don't need to become the most data-driven team in the world overnight; start to process and analyze the data, finding bits of information and insights to assist, as you decide whether there should be a formal data analyst/data scientist or not. From there, drive to faster experimentation and improvement. The faster you can get solid information and iterate, the better. That will naturally lead to more data democratization as people see the value of the data and want to get involved. That can easily build from the team level up to the larger organization level as well. But it's a journey, not a switch to flip.

Being cognizant of the cost of data acquisition has been relatively easy for Stephen, since his past employers were involved in hardware/IoT, where there was a distinct cost to pipe the data back into the organization. But it's easy for many organizations to lose sight of the cost of data collection. He recommends deciding what data to collect based on what specifically you want to figure out and why it might inform a decision. Just collecting data for the sake of data isn't good. Scott note: you really want to have these conversations as early as possible, because the earlier you have the data, the more you can shape your decisions. If XYZ metric will be crucial for a decision in 6 months, you don't want to start collecting the data for XYZ in 6 months.

For Stephen, the challenge of data-producing teams understanding downstream dependencies on their data is getting better but is still not fixed. But we shouldn't focus on just understanding who is using our data - that's the difference between producing data and producing data as a product. If you are producing data as a product, you should be actually interacting with your consumers, so it's inherent that you know who is consuming - and, crucially, why they are consuming. But there is of course a cost to producing data as a product - lots of engineering time, especially if you don't have the tooling and capabilities - so it should be incentivized, whether you are doing data mesh or not.

The best incentive - the 'best carrot' rather than stick - for getting teams to really work on sharing their data as a product has been the insight flywheel, from Stephen's point of view. If the consumers of a team's data generate many useful insights that flow back to that producing team, that's how data hopefully works at an organization-wide scale - the 1+1+1+1 = 10 kind of approach. But it's hard to make sure that happens and that teams give information back to the producing team as appropriate. You want to foster an org culture of 'if another team wins from my data, that's a win for our team too' - easier said than done :D

In wrapping up, Stephen talked about how to get a team that isn't data driven to start heading down that pathway. There was the data maturity journey mentioned earlier, but this is about developing a passion for data in someone on that team and then letting their enthusiasm for - and results from - data get everyone else on board. You may have to deploy someone into the team, but make sure there is someone inside it driving them to be more data driven out of passion instead of simply using the stick.

Data Mesh Radio is hosted by Scott Hirleman.
If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Mar 12, 2023 • 15min
Weekly Episode Summaries and Programming Notes – Week of March 12, 2023
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf