
Data Mesh Radio
Interviews with data mesh practitioners, deep dives/how-tos, anti-patterns, panels, chats (not debates) with skeptics, "mesh musings", and so much more. Host Scott Hirleman (founder of the Data Mesh Learning Community) shares his learnings - and those of the broader data community - from over a year of deep diving into data mesh.
Each episode contains a BLUF - bottom line, up front - so you can quickly absorb a few key takeaways and also decide if an episode will be useful to you - nothing worse than listening for 20+ minutes before figuring out if a podcast episode is going to be interesting and/or incremental ;) Hoping to provide quality transcripts in the future - if you want to help, please reach out!
Data Mesh Radio is also looking for guests to share their experience with data mesh! Even if that experience is 'I am confused, let's chat about it' on some specific topic. Yes, that could be you! You can check out our guest and feedback FAQ, including how to submit your name to be a guest and how to submit feedback - including anonymously if you want - here: https://docs.google.com/document/d/1dDdb1mEhmcYqx3xYAvPuM1FZMuGiCszyY9x8X250KuQ/edit?usp=sharing
Data Mesh Radio is committed to diversity and inclusion, both in our guests and our guest hosts. If you are part of a minoritized group, please see this as an open invitation to be a guest - hit the link above.
If you are looking for additional useful information on data mesh, we recommend the community resources from Data Mesh Learning. All are vendor independent. https://datameshlearning.com/community/
You should also follow Zhamak Dehghani (founder of the data mesh concept); she posts a lot of great things on LinkedIn and has a wonderful data mesh book through O'Reilly. Plus, she's just a nice person: https://www.linkedin.com/in/zhamak-dehghani/detail/recent-activity/shares/
Data Mesh Radio is provided as a free community resource by DataStax. If you need a database that is easy to scale - read: serverless - but also easy to develop for - many APIs including gRPC, REST, JSON, GraphQL, etc. all of which are OSS under the Stargate project - check out DataStax's AstraDB service :) Built on Apache Cassandra, AstraDB is very performant and oh yeah, is also multi-region/multi-cloud so you can focus on scaling your company, not your database. There's a free forever tier for poking around/home projects and you can also use code DAAP500 for a $500 free credit (apply under payment options): https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio
Latest episodes

Jul 28, 2023 • 22min
#243 Zhamak's Corner 26 - The Fundamental Data Need: Autonomy with Interconnectivity
Takeaways:
- It's important to understand that we need enablement to do data mesh well - enablement through technology and enablement through organizational approaches/behavior changes. Doing only one will likely not work.
- "...they need to move fast, they cannot be bogged down by centralization of any kind, organization or technology." Scott note: we discuss later the need for centrally provided enablers, but central bottlenecks are the speed and flexibility killer - look to prevent and remove them where possible.
- People want to simply produce data as a normal part of doing their job and make it consumable for the rest of the organization. How can we enable that? Why is it so tough? How do we make it interoperable - and more importantly interconnectable - too?
- Right now, the missing core component to do data mesh well is an easy ability to create and manage data products. Everyone is having to cobble things together and then trying to layer on the observability, the access control, the interconnectivity, etc. But it's built on a shaky foundation.

Sponsored by NextData, Zhamak's company that is helping ease data product creation.
For more great content from Zhamak, check out her book on data mesh, a book she collaborated on, her LinkedIn, and her Twitter.
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Data Mesh Radio episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman.
Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 24, 2023 • 1h 18min
#242 Making Data Accessible Makes Your Data Work Successful - More on PayPal's Data Mesh Journey - Interview w/ Kim Thies
Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Kim's LinkedIn: https://www.linkedin.com/in/vtkthies/
Gemba Walk explanation #1: https://kanbantool.com/kanban-guide/gemba-walk
Gemba Walk explanation #2: https://safetyculture.com/topics/gemba-walk/
PayPal Data Contract Template OSS: https://github.com/paypal/data-contract-template/tree/main/docs
Start with why -- how great leaders inspire action | Simon Sinek | TEDxPugetSound: https://www.youtube.com/watch?v=u4ZoJKF_VuA

In this episode, Scott interviewed Kim Thies, at time of recording a Leader on the Enterprise Data Team at PayPal and now SVP, Client Innovation & Data Solutions at ProfitOptics. To be clear, she was only representing her own views on the episode.

Some key takeaways/thoughts from Kim's point of view:
- When talking about data mesh to execs, it's helpful to go back to basics: "these are the four main principles, and this is what we've built and why." Scott note: I recommend you slightly alter the phrasing, especially around "Federated Computational Governance" ;)
- Look to Simon Sinek and "Start with Why". Always investigate the why for the other party. What would be enticing to your business execs to lean in on data mesh? Data mesh for the sake of data mesh is not going to win over the business.
- The communication and relationship building aspects of data work are often overlooked and will serve you better than just about any architectural or technology decision. Build the relationships so your great data work will address actual business challenges and be leveraged!
- ?Controversial?: Similarly, learn 'the art of the conversation' so you can extract from people their needs/wants and then see how you can help them meet those. "What isn't going well where you can help?" is a great question to find new use cases where the stakeholder will be engaged - you're directly addressing a business challenge for them.
- It's probably better to start from needing a solution where data mesh is a good fit rather than using data mesh as a hammer and looking for nails. You shouldn't want to do - or at least shouldn't propose doing - data mesh for the sake of doing data mesh.
- If you have a clear business use case, it's much easier to get people engaged and keep them engaged/involved - around data mesh or data work in general. Look to a tangible benefit - e.g. cost savings is a pretty easy first use case to go after.
- ?Controversial?: Data teams - especially data engineers - need to spend more time "experiencing" how customers/users actually use data to deliver on their business objectives. It will lead to better outcomes and better relationships with the business leaders/teams.
- Leverage the Gemba Walk philosophy: walk the 'factory floor' and talk to people far more often. Ask them how they get their work done. It doesn't need to be overly formal - just collect information to help others do their job better with data.
- You don't get to rest once you've gotten initial approval. Execs' attention will not last; they will start to focus on other challenges. Keep pointing to the business challenges you are addressing - not the data work itself - to stay relevant and near top-of-mind.
- It's not as if every aspect of your business starts doing a data mesh approach when you start your journey. There will be compromises, and other parts of the business will likely choose other approaches. That's okay and normal. Build your momentum and successes around data mesh but accept it won't be right for everyone, especially at first.
- Data mesh will be received differently by every person, or at least every persona. Each of the pillars might resonate differently. So be ready when speaking to focus on the aspect that is getting them to lean in. You need to balance the four pillars in your journey but not every conversation :)
- There is probably far more transformation needed in your data practices than you expect. Even after hearing that, there is still probably far more than you expect. Processes and hearts + minds especially.
- In the current business environment - likely headed for recession - you might be able to get people bought in on data mesh simply for time-savings for the data team. In downturns, cost cutting becomes far more attractive.
- Once you get your mesh journey going and you have some interesting capabilities to offer, it's important to go out and find additional use cases. Even if you're proving a lot of value, people still probably need a little convincing or at least some additional understanding of what you're doing - they won't all come to you.
- To speak the same language as your business partners, "you have to listen first." It's pretty easy to assume you mean the same thing, but even foundational phrases like data product and data contract often have completely different meanings for people within the same organization.
- It's incredibly easy to overlook user experience in data. Don't fall into that trap! Scott note: we did a data user experience panel if you want to dig deeper - episode #190.
- ?Controversial?: Domain ownership is probably the most important aspect of data mesh because it puts the data back in the hands of the people who really know it best. Context loss is such a prevalent problem in data, and data mesh solves for that quite well.

Kim started off with a bit about the PayPal journey to data mesh. They weren't looking to do data mesh; they had a specific business problem of disparate data sets across multiple domains that needed to be combined with decent governance and observability. It just so happened that data mesh was a great fit, so they went the data mesh route. There actually was also an engineering-led conversation of the engineering team looking for a use case to apply data mesh, but that didn't end up being the data mesh initiative that went to production. There was also a bit less scrutiny on the business-led journey because it was within a larger line of business/domain rather than being the core data team strategy.

One thing Kim believes is that data teams, especially data engineers, do not spend enough time really understanding and "experiencing" how customers use data. They should go and pair closer to the business teams to deliver better solutions. That will lead to better outcomes but also better relationships. This is that aspect of data UX that is often overlooked. People want something to make their jobs easier or make them perform better, not just data. So how does data actually intersect with that?

To actually do that 'experiencing', Kim and team leveraged the Gemba Walk philosophy (see links above for a deeper dive). It's a Japanese concept of going and 'walking the factory floor' to collect information. You don't need to do super formal interviews - meet people where they are and ask them questions about their day-to-day. Get a sense of what really matters and how they do work. If you just ask people what they want, they might give you the Henry Ford 'faster horse' answer versus you discovering their points of friction.

Kim discussed the challenges of not just getting but retaining buy-in and attention for your data mesh implementation - or really any data work. Once you get the approval, there are probably things that are more top of mind for your business partners, especially compared to the specific data work parts. So keep circling back to the business challenges you are addressing in conversations to stay near top of mind. Kim and team accomplished that by getting to a really rapid, tangible proof of concept. They quickly had something to show from the work that made it clear this could work at scale.

Starting from entirely inside one business unit meant Kim and team had a lot of autonomy. As soon as they were brought more into the central data team, things changed and there was much more focus on communication and sharing - and compromising too - especially around technology choices and approaches. Many teams were taking different approaches to similar challenges with the same technology, so how do you get to best practices?

One thing that really stuck out to Kim about data mesh and driving buy-in is that different pillars resonate for different personas. Data scientists or even data engineers embedded into a domain often love the self-serve aspect - not even for consumption but being able to actually self-serve their own data production needs. They aren't afraid of the ownership principle of data mesh because they can own their own timelines and not get stuck in centralized bottlenecks. They might not even realize they are struggling pre data mesh because the central bottleneck/friction is so ingrained in their way of working; you can really make things far better for them, but they wouldn't have considered asking.

Kim believes you should be prepared to market your mesh capabilities once you have them up and running. That doesn't have to be a sales-y approach, but you want to go and find additional challenges people couldn't solve before but now could with the mesh to gain more momentum, converts, and funding. Learning the art of the conversation is crucial. Is there a way that you can advance their business goals and address their challenges? If yes, they are more likely to lean in and stay bought-in to the use case. Don't be afraid of a few sales and marketing tactics to get to better business outcomes for all.

It's very easy to use the same term and mean something different - just look at all the 'what is data mesh' or 'what is a data product' content out there. So Kim believes that when talking to your internal partners, start from listening first so you understand how they are using different terms. It's easy to jump to solutioning instead of understanding, but your solutions will be lacking without the understanding :)

It's pretty easy to lose track of the customer experience in data mesh in Kim's view. There are so many things to focus on and honestly, UX design in data hasn't really been a thing. As an industry, we hadn't talked much about UX design for the platform or the data product before data mesh started to force the conversation. If you have all the capabilities in the world but a bad user experience, are people really going to use what you've built?

Circling back to communication, Kim talked about the challenges of communicating with execs and getting bogged down in the work done. She said it's important to start from baseline communication around data mesh: "these are the four main principles, and this is what we've built and why." If you start there, then people can really start to connect the work to what might be beneficial for them. And as she stated earlier, still look to start from the listening and understanding aspects first and then always start from the why. Scott note: see the YouTube link above.

In wrapping up, Kim talked about the through-lines of the conversation. Establish and build relationships, not just data products. Learning the actual problems/challenges is key. Talk to people where they are before you try to build something for them. Understand what's actually going on and where their friction points actually are. And then also look for scalable business cases - the best way to discover those is by establishing the relationships :)

Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
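Since the episode links PayPal's open source data contract template, here is a tiny illustration of what 'checking data against a contract' can mean in practice. This is a hypothetical Python sketch with an invented contract shape - it is not the PayPal template's actual schema, so see the linked repo for the real format:

```python
# Hypothetical minimal data contract check (invented shape, NOT the
# PayPal template's schema). A contract declares expected fields and
# types; a validator reports which parts of a record violate it.
from typing import Any, Dict, List

CONTRACT = {
    "dataset": "orders",  # logical data product name (illustrative)
    "fields": {"order_id": str, "amount": float, "currency": str},
}

def violations(record: Dict[str, Any], contract: Dict[str, Any]) -> List[str]:
    """Return a human-readable list of contract violations for one record."""
    problems = []
    for name, expected_type in contract["fields"].items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"wrong type for {name}: got {type(record[name]).__name__}")
    return problems

good = {"order_id": "A-1", "amount": 9.99, "currency": "USD"}
bad = {"order_id": "A-2", "amount": "9.99"}

print(violations(good, CONTRACT))  # []
print(violations(bad, CONTRACT))   # ['wrong type for amount: got str', 'missing field: currency']
```

The point Kim makes stands regardless of format: a contract is only useful once producer and consumer agree on what the terms in it mean.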

Jul 23, 2023 • 16min
Weekly Episode Summaries and Programming Notes – Week of July 23, 2023

Jul 21, 2023 • 18min
#241 Data Product Success Metrics - A Kinda Deep Dive - Mesh Musings 51
Key summary points:
- At the start, it's more important to start measuring than it is to measure the right things. Do NOT let analysis paralysis hold you back.
- Similarly, your success metric measurement framework will probably suck to start. Oh well, get to measuring.
- Create a framework and tooling/platform capabilities - where necessary/useful - to make measuring and reporting against success metrics simple. That framework should be about defining the metrics and especially how to measure, not what success looks like for individual data products.
- Use fitness functions.
- Good metrics to consider, in order of usefulness: user satisfaction, user value, data quality, time to business decision, delivery to expectations, time to update (can be squishy), and usage.
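The fitness-function idea above can be sketched in code. Below is a minimal, hypothetical Python example (all names and thresholds invented for illustration, not from any specific tool): each fitness function scores one success metric between 0 and 1, and a shared framework standardizes how metrics are measured and reported while leaving targets to each data product team.

```python
# Hypothetical sketch of a data product "fitness function" framework.
# Each function scores one success metric (0.0-1.0); the framework only
# standardizes how metrics are measured and reported, not the targets.
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class DataProductMetrics:
    user_satisfaction: float        # e.g. survey score normalized to 0-1
    quality_check_pass_rate: float  # share of data quality checks passing
    weekly_active_consumers: int

FitnessFn = Callable[[DataProductMetrics], float]

@dataclass
class FitnessFramework:
    functions: Dict[str, FitnessFn] = field(default_factory=dict)

    def register(self, name: str, fn: FitnessFn) -> None:
        self.functions[name] = fn

    def evaluate(self, metrics: DataProductMetrics) -> Dict[str, float]:
        # Uniform reporting: every data product reports the same metric names.
        return {name: round(fn(metrics), 2) for name, fn in self.functions.items()}

framework = FitnessFramework()
framework.register("user_satisfaction", lambda m: m.user_satisfaction)
framework.register("data_quality", lambda m: m.quality_check_pass_rate)
# Usage is the least informative metric, so cap its contribution at 50 consumers.
framework.register("usage", lambda m: min(m.weekly_active_consumers / 50, 1.0))

scores = framework.evaluate(DataProductMetrics(0.8, 0.95, 30))
print(scores)  # {'user_satisfaction': 0.8, 'data_quality': 0.95, 'usage': 0.6}
```

The design choice mirrors the episode's point: the framework owns the measurement plumbing, while each product team decides which thresholds actually count as success.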

Jul 17, 2023 • 60min
#240 Driving to Better Healthcare Patient Outcomes Through Data - Interview w/ Smriti Kirubanandan
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Smriti's LinkedIn: https://www.linkedin.com/in/smritikirubanandan/
Smriti's HLTH Forward Podcast: https://hlthforward.buzzsprout.com/

In this episode, Scott interviewed Smriti Kirubanandan, a Healthcare and Public Health Data Expert at a large consulting firm. To be clear, she was only representing her own views on the episode. Many of the challenges and opportunities discussed in this episode are more on the US side because of the not-so-well-functioning healthcare system there.

Some key takeaways/thoughts from Smriti's point of view:
- In healthcare, it's easy to lose sight of the patient in the data - focusing solely on a condition, an area of the body, or a set of data instead of a person. It's vitally important to look at the data through a lens of treating the patient as an entire person.
- !Controversial!: It can sound time consuming to interact with data "in a much more intimate format", much like a 1:1 conversation, but it's very important to drive to better outcomes. Instead of automated decisioning, we can point our tooling to compile the relevant information better to make decisions faster without removing the care or the person. Machines making automated decisions leads to worse patient outcomes.
- "Obviously, privacy is important. Ethics is important. How do we interconnect this data and how do we get to communicate amongst" the payers and providers? So physicians can look at a much more complete picture of the patient to treat them better.
- There are many organizations collecting important health data about people. We need to rally around the patient outcomes instead of the financial outcomes and combine the data. Easier said than done, of course.
- ?Controversial?: Companies with important health data need to lean forward at the table and buy in on collaborating around sharing data, or we will continue to have suboptimal patient outcomes.
- More organizations should make it possible to 'act local' relative to individual health. Instead of every decision being a very complex one, can we make things easier to simply make health progress, if not 'fix' everything for someone's health? Basically, make it easier to make small decisions around more concrete and focused areas, much like a domain in data mesh.
- It's very important to empower people to leverage their own health data, so we have to focus on getting them access and then giving them the power to do something with their data to drive better outcomes.
- There are 3 big issues we need to tackle simultaneously: 1) How do we give access to relevant and useful data to caregivers? 2) How do we ensure digital equity? And 3) how do we share data ethically?
- Think about interoperability - can I pull data from one system to integrate into my system - and interconnectivity - a more two-way interoperability/integration. We need to focus on interconnectivity far more.
- Especially in something as important and complex as healthcare, it's crucial for the data and engineering people to stay focused on target outcomes and not get lost in the code/work. A shared vision at the project and organization level is key.
- Many data projects go wrong because we still struggle with communication. Not that we aren't communicating, but keeping all parties to data work aligned and in sync as learnings emerge is very hard. And data work needs to allow gray areas, which it often doesn't do well currently.
- Value-based care is a really important aspect of getting people the best care, and data can help support that well. But it requires a lot of ethics and transparency to get there.
- ?Controversial?: Digital twins of actual people could change healthcare in major ways. If done right, they could greatly improve the ability to treat patients because of the ability to test against negative health outcomes and find more optimal treatment plans.

Smriti started out the conversation with a bit about her background and then jumped right into a key challenge in data around healthcare today: treating the patient as a person and not a set of data points and measurements. How do we look holistically at a person and focus on what would be best for their health AND life - the two are intrinsically linked? We still want to drive insights, but personalized care, at least in the US, seems to be on the way out, and we can bring it back better with better data.

Can we actually interact with healthcare data at the person level instead of at the billing code level? Smriti believes we can - that instead of letting the automated tooling make the calls on important health decisions, such as whether key procedures or tests are authorized by insurance, we can use the tooling to allow a more "intimate" interaction with what the person is going through and how we can serve them best. We can better leverage tools to make more humane decisions for folks.

For Smriti, it's still a very tough challenge: how do we get data to interact across the various healthcare data silos, how do we smoothly exchange this data, and how do we tackle the governance of making the data interconnect? Right now, physicians cannot see a large amount of crucial patient data; but is it on patients to connect the data between offices and facilities so doctors have a more complete view? How do we maintain privacy if we are sharing information across systems? What about ethics - do we really want to give a lot of these companies very intimate health data? Scott note: see the recent acquisition of One Medical by Amazon - they are now supposedly requiring patients to waive their HIPAA rights to get care.

Because there are so many challenges around integrating healthcare data across so many systems/silos, Smriti believes that one company by itself can't make that big of an impact on the overall system. BUT each company being better about doing their part can help achieve a data-driven aspect to healthcare that leads to better patient outcomes. There needs to be more of a concerted effort to collaborate in the right ways.

In Smriti's view, it's very important to empower people to make better health decisions for themselves driven by data. That means giving them access to more of their data, giving them the capabilities to leverage that data to make decisions, and then empowering them to actually act on those decisions. There are some pretty basic things we can do to improve the health of our fellow citizens, and it's on a number of people to keep the pressure on to move forward on that. No one entity can do it alone, but we should all be pressing for better solutions.

Smriti talked about 3 big challenges to sharing our data and driving better patient outcomes. The first is how do we actually get data in front of our caregivers? How can we empower the individual to share that data, and how can physicians or other caregivers access it and drive better patient decisions? The second is how do we ensure digital equity? Many people don't have access to good internet, and many are not digitally literate enough to actually participate in data sharing - how do we empower them to participate in better health outcomes? And the final challenge is how do we actually share this data ethically and with empathy? All of these are being worked on, but it takes a very large cross-org contingent to move things forward. Everyone can play a part, but it will take a lot of collective work.

Interoperability versus interconnectivity is something Smriti is passionate about because interoperability doesn't really ensure that two systems can share information all that well - it might be that your data is in a proper format but your definition is way different than mine. Interconnectivity, in her definition, is about a two-way collaboration and easy integration between systems around data. That interconnectivity is necessary to really supercharge our health data revolution :)

Smriti talked about a challenge in data that many past guests have touched on: how do you keep people focused on target outcomes instead of the minutiae of the work, keeping them from getting lost in the code instead of what you are trying to achieve? It's key to have a shared vision about what the goal is and why you are doing the work. And if people lose sight of that, you need to bring them out of the weeds or you'll get very interesting solutions that don't solve actual important problems.

When asked why data work seems to not net expected results so often - the 80%+ of analytics initiatives don't meet expectations statistic - Smriti pointed to difficulties in communication. Not that we aren't communicating, but the challenge is how do we quickly iterate together and share small-scale incremental learnings so reality and expectations are not constantly drifting apart. Basically, communication is hard and we need to place more focus on getting it right - and having a wider tolerance range initially of what 'right' means - but there's no silver bullet. We need to be able to be vulnerable with each other and operate in gray areas :)

Smriti really believes in the concept of value-based care. But to get there, we need transparency around price and care. Individuals need to have access to their information but also need to equip themselves with the knowledge of how to leverage their data to get better care. It isn't all on the care workers.

Digital twins in healthcare are something Smriti is really excited about. A digital twin of a person gives providers an ability to potentially test reactions to different treatment protocols, optimizing positive outcomes and hopefully minimizing negative outcomes. Physicians can test a number of treatments simultaneously without experimenting on the patients themselves :) Healthcare digital twins are in their very early stages, but she is quite excited about the possibilities.

Jul 16, 2023 • 16min
Weekly Episode Summaries and Programming Notes – Week of July 16, 2023

Jul 14, 2023 • 1h 4min
#239 Panel: The Role of Data Product Management in Data Mesh - Led by Frannie Helforoush w/ Alla Hale and Jill Maffeo
Please Rate and Review us on your podcast app of choice!
Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Frannie's LinkedIn: https://www.linkedin.com/in/frannie-farnaz-h-a7a11014/
Jill's LinkedIn: https://www.linkedin.com/in/jillianmaffeo/
Alla's LinkedIn: https://www.linkedin.com/in/allahale/
In this episode, guest host Frannie Helforoush, Technical Product Manager/Data Product Manager at RBC Global Asset Management (guest of episode #230), facilitated a discussion with Alla Hale, Senior Data Product Manager - Digital Capabilities at Ecolab (guest of episode #122), and Jill Maffeo, Senior Data Product Manager at Vista (guest of episode #151). As per usual, all guests were only reflecting their own views.
The topic for this panel was broadly data product management and the role of the data product manager in a data mesh implementation. Data Product Manager is still a very nascent role so there is still a lot of confusion around it :) If I were to sum up the feeling of the conversation very succinctly, it would be: it's early days, have patience.
Scott note: I wanted to share my takeaways rather than trying to reflect the nuance of the panelists' views individually.
Scott's Top Takeaways:
The role of the data product manager is pretty wide-ranging. It's easy to get overwhelmed and not focus on what really matters.
We have to be patient as we learn best practices around data product management because it's still a nascent space.
It's crucial to focus on who the user of a data product will be - whether that is an intermediate user creating a report for another or someone directly consuming from the data product. It's very easy to lose sight of the exact use case and how people will use the data product - to consume information. Data for the sake of data is just expensive 1s and 0s.
There isn't even a common understanding of what a data product is, no standard definition. So data product management is even tougher to define in many ways. It is still a very emerging practice so everyone is still figuring it out. It's okay if things are a bit messy and muddled; they are for everyone else too.
Because there isn't _really_ a tangible UI (UX is also a bit muddled) to a data product, it's really hard to get a good understanding of the boundaries of a data product. People don't have experience with data products like they do with most types of products, digital or physical, so you have to have some patience as people figure it out.
There are a ton of learnings we can bring from physical and digital/software product management to data products. Some things only need small tweaks to work well. But be prepared for lots of trial and error, so make more room for experimentation than you would in software.
What is a data product team - or is there a data product team - is a question every organization has to ask. And the answer for each org probably evolves too. At first, as you are figuring out how to build and manage data products and your platform is immature, you probably have a team specific to data products.
But eventually, for many data products/domains, there will likely simply be a data product developer that is part of the product team, or data product development will be among the general team's developer duties and product management gets a bit more fuzzy.
Product marketing is a relatively foreign concept in data products. But it seems there needs to be far more interaction with existing and potential users of a data product to add value to existing use cases and create new use cases. As of now, that responsibility probably falls on the data product manager, unfortunately.
The best path to developing a very valuable data product is 'consuming stakeholder' engagement. If the consumer isn't engaged, if they aren't giving the necessary information to really develop the data product to solve a use case, consumption seems to be below expectations across every org I'm seeing.
Other Important Takeaways (many touch on similar points from different aspects):
User Experience (UX) is such a crucial part of essentially all product management - how does that play in data products? Is it on the platform team to create the UX and the data products just plug into that UX? Is that just the UI? How do documentation, number and types of APIs, interoperability/interconnectivity to other data products, etc. factor in?
We need more adoption of forward-thinking/leaning product management practices in data. Instead of only being reactionary to requests, how are we going to extract what people want next?
Data products are so new and so many people have preconceived notions of what the phrase should mean or does mean. Be prepared for lots of work on alignment just around the definition of data product.
Be prepared for friction when introducing good software product management practices to data.
Many data people aren't used to really interacting with product management/managers, so they will require some time and hand-holding.
Getting buy-in for the budget to put the proper amount of data product management/managers in place might be a little difficult, especially when you start to ask where that budget comes from. Make sure to have these conversations early in your journey and adjust as needs arise.
Similarly, a common issue in software product management is a PM being too overloaded to talk to customers or potential customers. That's even more likely to happen in data product management if you aren't careful. Really focus on making sure there is enough time for product discovery or you will be doing development by request only.
Defining data product success metrics - broadly and then specifically for each data product - is going to be a struggle. Usage is an indicator but not comparable across data products, for instance. That doesn't even get into the challenge of then measuring the success metrics. But you just have to start, and starting with mediocre metrics is better than not starting at all.
How the heck do we think about A/B testing relative to data and data products? Do we have enough consumers to test? What does it even mean to A/B test data?
Similarly, experimentation is key to doing data better. But can we experiment around the data itself? Or are we just using data to power and measure experimentation? Maybe experimentation around interfaces to data?
Data products have a lifecycle. If you are unsure of the value of a data product to users and you can't really get them to articulate it, sometimes shutting down the data product - at least temporarily to do the 'did anyone scream' test - is going to be the right call.
It's crucial to balance long-term product sustainability with time to market.
As you learn more about building data products, many organizations are seeing decreasing time to market for new data products. But especially at the start of a data mesh journey, it is really easy to create non-sustainable/maintainable products that match requests instead of needs in a productized way.
We can create data products that answer questions that 'should' be asked, but should we? If we are sharing insights that are not ready to be leveraged or are not what the stakeholders care about, what will adoption be like? Thus far, it seems like most data mesh implementers are saying adoption is below expectations for proactive data products not created for specific stakeholder-defined use cases. Hopefully that changes as we increase data literacy, but it's important to consider.
It's crucial to consider interoperability and cross-domain usage/queries as part of your potential user flow, but it's also very hard to try to map ahead when you aren't sure of those use cases. We can easily lock ourselves in and reduce flexibility, which then might become like a data warehouse - just micro-warehouses everywhere…
It will be interesting to see where data product managers come from - if they come from traditional product management, they have to learn a lot of the hidden nuances of data. But coming from data, it's easy to get too tied into the data work instead of what the data product is supposed to do.
Figuring out where data product management reports will be important for organizations. It could be into the data team, the product org, the domain, or maybe even the CTO. There are definitely puts and takes to each. Personally, I think they should start reporting into the data org and eventually move to being part of the product org or reporting into the domain itself, because data ownership and data products are just a part of a domain's mandate.
Frannie asked a great question: should the data product manager own the data product strategy?
And what even is a data product strategy? I feel like the answer is yes but that data product strategies will be very immature as people learn how to build and manage data products. Who should own the overall strategy across all data products?
It all starts from the user and use case. To build a good data product, you need to understand how people are going to use it. But then, are we building one-off instead of universally usable data products? I think you start from a use case and build out, expand out but maintain as much flexibility as possible.
Product discovery - the process of finding out what products you should build and what are the useful features to add to them - is really important in product management. But it's far more challenging in data because everyone always says yes to 'do you want more data?' How can you determine what will be used and valued? Where is a real use case versus a 'that would be nice to have'?
Data product management should extend into what data we want, what incremental data we should generate, not only what data we have. As Stephen Galsworthy talked about, especially with hardware, you often don't get to change what info you are collecting. Data sourcing is going to be part of data product management.
Maybe good data product marketing is just being highly available and willing to explain - having office hours and show and tells. If trying to generate new use cases from unengaged potential users isn't driving great value, maybe it's all about inbound marketing, not outbound.
Much of the value in data mesh is in cross-domain use cases, and that value is captured by combining information from different data products. But it seems we are still in the early days of figuring out how to communicate some of the potential questions out there to design good data products to answer them.
Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about
Data Mesh Radio is hosted by Scott Hirleman.
If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 10, 2023 • 1h 10min
#238 Bringing Software Testing Best Practices to Data - Interview w/ Sofia Tania
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Tania's LinkedIn: https://www.linkedin.com/in/sofia-tania/
Presentation: "Data Mesh testing: An opinionated view of what good looks like": https://www.youtube.com/watch?v=stNZQESndAA
In this episode, Scott interviewed Sofia Tania (she goes by Tania), Tech Principal at Thoughtworks. To be clear, she was only representing her own views on the episode. Scott asked her to be on especially because of a presentation she did on applying testing - especially important for data contracts - in data mesh.
Scott note: I was apparently getting extremely sick throughout this call so if I ramble a bit, I apologize. Tania's dog also _really_ wanted to be part of the conversation so you might hear us both chuckling a bit about her antics. And Tania has some really great insights, so I asked her probably the hardest questions of any guest to date. She did a great job answering them though! A lot of the takeaways are about whether we are actually ready to do a lot of the necessary testing to ensure quality around data, which I don't think has a clear answer yet :)
Some key takeaways/thoughts from Tania's point of view:
We have to bring software best practices to data but we should do it smartly and not make the same mistakes we made in software - let's start from a leveled-up position.
Zhamak has said the same. The question becomes how, but looking at how practices evolved in software should bring us a lot of learnings.
Just pushing ownership of data to the domains won't suddenly solve data quality challenges. The new owners - the domains - have to really understand what ownership means and what quality means for use cases leveraging their data.
A reasonably good way to measure if your data product is 'good enough' regarding data quality is to look at your SLOs (service level objectives) and SLIs (service level indicators). If you are constantly hitting the SLIs, you can probably focus more on new features. If not, you need to improve your quality.
!Controversial!: Consider almost a zero-trust approach to testing for data. Test as data is flowing through the systems, as data lands. Test in development as to what changes might impact data. And then consumers should be writing tests against source data to prevent issues. Scott note: that's a lot of tests, but how important is certain data to your org?
In a decentralized ownership model, many data consumers are less likely to trust data - at least at first - so you need to show them why they can trust it at the data product level. Leveraging proper testing and data contract strategies is crucial to being able to prove out data quality.
You should look to build out a robust testing and observability framework as part of your platform. Data product owners/domains shouldn't have to build it out manually themselves.
If you only have detection of data quality issues once data hits production, once something or someone is potentially already using it, that's an issue. Look to create ways to test data at the data product development stage, as part of the CI/CD. We can't rely only on lagging quality indicators if we want to up our data quality game.
Data for analytics and AI is even more complicated than on the operational plane.
Generally, data hasn't been transformed multiple times on the operational plane, so if there is an issue, it's either in the source application or an issue with the API call that was made. In data, we have to develop smarter tests as data flows through the pipelines.
Data producers need to define quality data in terms of what consumers actually want/need. Instead of arbitrarily setting quality levels, what do the consumers want?
Consumer-driven testing in data sounds wonderful. But it's hard to see teams being willing to do it :) We need better tooling and ways of working to make this easier.
Data quality surveys of data consumers are important for a number of reasons but are lagging indicators. They should be used to help develop appropriate SLAs/SLIs for data products and monitor if data products are generally meeting customer needs.
?Controversial?: Can a data producer really develop a custom test for their data product for each consumer, or does the consumer owe it to the producer to develop tests to ensure the data product continues to serve their use case well? Scott note: this could start a LinkedIn war but it's an important question to ask!
If you push for consumer-driven testing, don't be surprised at a lot of pushback. That still happens even in the API world, where it's been more accepted for years :)
Are consumers ready and able to programmatically define what good data quality means for them for each use case? There are some tools that can help but practices and tooling are still mostly nascent.
?Controversial?: Many consumers still have the 'give me all you have and I'll sort through it' mindset. Trying to get them to lock into what they are consuming will be hard.
There can be a real chicken-and-egg scenario around data products, especially testing. Consumers don't know precisely what they want and what will best suit them until they see the data/data product.
But building out a data product and having to change it a lot in response to customer feedback is also tough - producers want to build it once instead of in 10 iterations. Just be prepared for this to be an ongoing issue in data and for it to lengthen times to data product release.
?Controversial?: Having your transformations handled by low-code/no-code solutions can easily hurt you more than it will help you. Be very wary. Scott note: this is coming up in A LOT of conversations recently and was featured in the Thoughtworks Radar released in early May.
In software development - including development for data - abstractions are crucial but can get you in trouble. Really think deeply about your abstractions because it's easy to lose sight of what underlies the abstraction. And abstractions of abstractions of abstractions just compound the issue :)
Tania started with a bit of her background, especially related to data mesh. She worked with one of the clients that was an inspiration for Zhamak's original data mesh blog post and spent 2+ years as the lead on the technical side of a data mesh implementation at another client. Her background as a developer and tech generalist has shaped her thoughts around bringing good software practices to data.
For Tania, the reason she originally put together the presentation on software testing practices in data was a client question (paraphrased): 'if we already have data quality issues in the centralized setup with clear ownership and people who really know data, how the heck are we going to _improve_ data quality by pushing data ownership into the domains?' It's a very fair question - just pushing ownership without the capability and buy-in to own the data is possibly (likely?) going to lead to worse quality. So we need tests that work and can be shown to consumers to help ensure quality and trust in that quality.
Showing people your kind of data quality certification goes a long way towards trust.
In Tania's view, much of the existing data observability tooling and practices, while valuable, only really alert when there is already a problem that's hit production. Is there a way we can shift testing left - not just in ownership, but testing earlier in the flow of data? Earlier in the development timeline of a data product? So that is 3 potential ways to shift left, to test earlier. Think about detect versus protect - can we prevent data quality issues instead of only better identifying and resolving them?
Tania talked about how data product producers need to start to shift their thinking around data quality. What specifically do my consumers want - and why? Quality is inherently subjective, so extract from them what their needs are and look to serve those. And we should stop using _only_ lagging quality indicators like surveys. They are valuable in reshaping what SLAs should be and whether a data product is meeting needs and expectations, but they are certainly not designed to quickly detect issues. But do consumers actually know what would make data 'high quality' for them?
Consumer-driven data quality testing is a good idea for many reasons in Tania's view. When we think about a single data product having 5 known, regular data consumers, does the data producer need to develop 5 different sets of tests to specifically protect against breaking changes or issues specific to each use case? Do they have to define quality metrics differently for each use case of the same data product? Do they have to be so familiar with each use case that they evolve their tests as use cases evolve? How much can we reasonably ask the data product consumers to do in the testing space to ensure quality?
But Tania admitted she hasn't led a client in doing consumer-driven testing for data. It's really hard to get data testing right in general - are people really ready for doing consumer-driven data testing?
We don't really have the tooling or the general best practices to do it well yet. And there is also just philosophical pushback - being forced to programmatically say what good quality means, instead of saying 'the data quality isn't good enough', is a tough pill to swallow for consumers. Do consumers really know precisely what they want? Tools like Great Expectations or Soda Core are a good start here but we need more. And many consumers are still in the 'give me all the data you have' kind of mindset, so reducing the possible scope of data they get is not an easy mindset shift.
Tania also pointed to a persistent chicken-and-egg challenge in data: data producers can't build exactly what consumers want until they get feedback from the consumers. But the consumers don't know exactly what they want until they've seen an early iteration of the data product. So you have lengthening time between conception and release because both sides need more from the other to move forward but can't until they get at least some information. A good way to press consumers might be to ask them about bad-case scenarios - what has to be there and why? That will _possibly_ prevent kitchen-sink feature requests.
As the conversation transitioned into low-code/no-code tooling, Tania lamented the difference between ease of use and simplicity. While low-code/no-code tools can be very easy to use at the start, as the scale/complexity of use cases increases, they often become extremely difficult to manage. They are focused on ease of use; their architecture isn't about maintaining simplicity of managing the solution as it scales. As you add more and more views, you might actually have 30-40 joins across many data products and performance comes to a halt.
This was also mentioned in the Thoughtworks Radar that was released in early May 2023 (Tania contributed to that).
In wrapping up, Tania shared what she believes is a good way to measure if you are doing well enough with your data product, especially in regards to data quality. Look at your SLOs (service level objectives) and SLIs (service level indicators) - are you hitting those regularly? Then maybe you can focus more on new feature development. But if not, you might need more/better monitoring/observability.
Data Mesh Radio is hosted by Scott Hirleman.
If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
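To make the SLO/SLI framing from this episode a bit more concrete, here is a minimal, hypothetical sketch of how a consumer's quality expectations might be expressed as indicators checked against agreed objectives. This is not from the episode or any specific tool - the record shape, indicator names, and thresholds are all invented for illustration:

```python
# Hypothetical sketch: a consumer encodes its quality expectations as simple
# programmatic checks; the producer compares the resulting indicators (SLIs)
# against agreed objectives (SLOs). All names/thresholds here are made up.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class OrderRow:  # toy record in an imagined 'orders' data product
    order_id: str
    amount: float
    updated_at: datetime

def compute_slis(rows, now):
    """Compute simple service level indicators for a batch of rows."""
    total = len(rows)
    complete = sum(1 for r in rows if r.order_id and r.amount >= 0)
    fresh = sum(1 for r in rows if now - r.updated_at <= timedelta(hours=24))
    return {
        "completeness": complete / total if total else 0.0,
        "freshness_24h": fresh / total if total else 0.0,
    }

def meets_slos(slis, slos):
    """True only if every indicator hits its objective."""
    return all(slis[name] >= target for name, target in slos.items())

now = datetime.now(timezone.utc)
rows = [
    OrderRow("o1", 19.99, now - timedelta(hours=1)),
    OrderRow("o2", 5.00, now - timedelta(hours=30)),  # stale row
    OrderRow("o3", -2.00, now - timedelta(hours=2)),  # negative amount
    OrderRow("o4", 42.50, now - timedelta(hours=3)),
]
# Example consumer-agreed objectives (assumptions, not from the episode)
slos = {"completeness": 0.95, "freshness_24h": 0.90}
slis = compute_slis(rows, now)
print(slis, meets_slos(slis, slos))  # both SLIs land at 0.75 -> SLOs missed
```

Run as part of CI or as data lands, a failing `meets_slos` result is the 'protect, not just detect' signal discussed above - and because each consumer can contribute its own `slos` dict, it also hints at what consumer-driven testing could look like in practice.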

Jul 9, 2023 • 31min
Weekly Episode Summaries and Programming Notes – Week of July 9, 2023
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 7, 2023 • 21min
#237 Zhamak's Corner 25 - We Don't Have to Jerk the Wheel - Making Smaller Correction Decisions to Get to Our Data Destination
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Sponsored by NextData, Zhamak's company that is helping ease data product creation.
For more great content from Zhamak, check out her book on data mesh, a book she collaborated on, her LinkedIn, and her Twitter.
Takeaways:
We should be thinking about how we can get out of batch mode into streaming mode. Yes, technologically, but also think about how we can get to making decisions based on smaller amounts of data more frequently - both for automated systems like AI and for our people. Instead of making adjustments or decisions based on big batches of data, we can make smaller course corrections.
"Data mesh is about building responsibility into data and the quality of the data you share and being explicit about that quality."
Make the cost of mistakes that much smaller by creating smaller decisions that add up to the bigger decisions - it's not one giant leap, it's many steps that can avoid more hazards as you come across them.
"Make decisions at the speed of the market" is crucial to being nimble - being able to react to opportunities or new challenges. To do that, we need to put data in the hands of those closest to the market, the domains.
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Data Mesh Radio episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf