
Data Mesh Radio
Interviews with data mesh practitioners, deep dives/how-tos, anti-patterns, panels, chats (not debates) with skeptics, "mesh musings", and so much more. Host Scott Hirleman (founder of the Data Mesh Learning Community) shares his learnings - and those of the broader data community - from over a year of deep diving into data mesh.
Each episode contains a BLUF - bottom line, up front - so you can quickly absorb a few key takeaways and decide if an episode will be useful to you. Nothing worse than listening for 20+ minutes before figuring out if a podcast episode is going to be interesting and/or incremental ;) We hope to provide quality transcripts in the future - if you want to help, please reach out!
Data Mesh Radio is also looking for guests to share their experience with data mesh! Even if that experience is "I am confused, let's chat about" some specific topic - yes, that could be you! You can check out our guest and feedback FAQ, including how to submit your name to be a guest and how to submit feedback - anonymously if you want - here: https://docs.google.com/document/d/1dDdb1mEhmcYqx3xYAvPuM1FZMuGiCszyY9x8X250KuQ/edit?usp=sharing
Data Mesh Radio is committed to diversity and inclusion, including in our guests and guest hosts. If you are part of a minoritized group, please see this as an open invitation to be a guest - please hit the link above.
If you are looking for additional useful information on data mesh, we recommend the community resources from Data Mesh Learning. All are vendor independent. https://datameshlearning.com/community/
You should also follow Zhamak Dehghani (founder of the data mesh concept); she posts a lot of great things on LinkedIn and has a wonderful data mesh book through O'Reilly. Plus, she's just a nice person: https://www.linkedin.com/in/zhamak-dehghani/detail/recent-activity/shares/
Data Mesh Radio is provided as a free community resource by DataStax. If you need a database that is easy to scale - read: serverless - but also easy to develop for - many APIs including gRPC, REST, JSON, GraphQL, etc., all of which are OSS under the Stargate project - check out DataStax's AstraDB service :) Built on Apache Cassandra, AstraDB is very performant and, oh yeah, is also multi-region/multi-cloud so you can focus on scaling your company, not your database. There's a free-forever tier for poking around/home projects, and you can also use code DAAP500 for a $500 free credit (apply under payment options): https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio
Latest episodes

Jul 12, 2022 • 35min
#100 A Lookback at What We've Learned So Far - Mesh Musings 22
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 11, 2022 • 1h 9min
#99 Getting Philosophical About Knowledge and Sharing Experiences via Data - Interview w/ Andrew Padilla
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.
Andrew's LinkedIn: https://www.linkedin.com/in/andrew-padilla-8988094a/
Datacequia website: https://www.datacequia.com/
Andrew's personal Substack: https://datacequia.substack.com/
Data Mesh Community newsletter Substack: https://datameshlearning.substack.com/

In this episode, Scott interviewed Andrew Padilla, who runs a data and software consulting company - Datacequia - and serves as editor of the Data Mesh Learning community newsletter.

This one is a bit more philosophical about sharing information/knowledge, so it's one to sit and think over. Things in quotes are direct from Andrew.

Some key takeaways/thoughts that come from Andrew's view of data mesh and the data space in general:
- To move from sharing the 1s and 0s of data to actually sharing knowledge, we need to harmonize data, metadata, and code - "the digital embodiment of knowledge". That's where Andrew hopes mesh data products can head (see the short sketch after these show notes).
- Software development isn't cutting it for sharing knowledge. Will data product development? Do we need to move to knowledge-centered development instead? It remains to be seen.
- We still don't know how to model well - in data - what is going on in the real world. What are the experiences of the organization? Can we really define an "organizational experience"? Event storming tries but seems to fall short often.
- We must learn to treat organizations like living entities. Organizational experiences cross multiple domains, and the types of experiences will change and evolve - possibly quite quickly. We have to get better at modeling those experiences and at evolving how we share knowledge about them.
- Knowledge graphs are the best way we currently have for combining information across domains. We still haven't fully figured out how to leverage our cross-domain knowledge, though.
- Historically, we've bent our ways of working to the limitations of the machines. We need to spend more time bending the machines to better match the way humans store, process, and share knowledge.
- Data centricity is an interesting concept but might take our current imbalance of data versus operational focus too far towards data. Then again, that might be what is necessary to really get to balance - it remains to be seen. But it's crucial to understand that a data-first focus isn't necessarily a knowledge-first approach, or one that treats knowledge as a first-class citizen.
- It's important to understand that mesh data products are a means to an end in data mesh. Yes, they are crucial to sharing information, but they serve a purpose - they are not the purpose.
- In data mesh, it can be easy to focus too much on creating data products of immediate utility or that are high value in and of themselves. But it's important to think about how data products together create value - and maybe not immediate value - to really drive forward our understanding of the organization's knowledge and experiences.

Andrew started the conversation with his hope and vision for data products - or the data quantum - in data mesh. Historically, data, metadata, and code are not often grouped together, and even less frequently are they in harmony. They belong together, as that harmony creates a higher-level abstraction to share knowledge, not just the 1s and 0s of data. To get data mesh right, Andrew believes you have to really figure out how to build mesh data products with that harmonization in mind. And you probably won't get it right at the start of your journey, and that's okay - we are all still figuring it out.

We possibly - or even probably - need to move from software development, and even data product development, to knowledge development in Andrew's view. Knowledge development means centering the development process on sharing knowledge. So much of the data we share lacks the actual context of what happened in the real world, the experiences of the organization. But we still don't have a great way of sharing those organizational experiences - event storming in DDD (Domain Driven Design) tries to address this but often falls short. How can we progress towards modeling organizational experiences?

For Andrew, as many have said, if we can figure out how to do data mesh well and create some good standards, it is very well suited for cross-organizational knowledge sharing and collaboration. Not data selling but collaboration - similar to what Jarkko Moilanen mentioned in his episode's discussion about the data economy. But it's still early, and data mesh will not be the silver bullet for figuring out how to do that cross-organizational collaboration well - and, crucially, safely and compliantly.

We are just starting to understand, to develop a point of view on, what Andrew called the "knowledge of experience". Per Andrew, "knowledge, by definition, is just the acquisition and use of experience and/or education". We must learn to treat organizations like living entities -> organizations have new experiences and are changed by them. And the types of experiences are also changing as the real world changes. In the 80s and 90s, much of the communication between entities was done by fax; faxes aren't nearly as common for most organizations nowadays. The evolution of experiences seems to be accelerating, and we are still struggling to capture those experiences in data.

For Andrew, software and data must reflect or "be the embodiment of" those organizational experiences. Do we even know what it really means for an organization to have an experience? Much less how to model it? And those experiences don't take place in isolation within a specific domain. So once we figure out experience modeling for a domain, we then get to figure out how to scale that for experiences across domains. Yes, we have some work ahead of us!

Knowledge graphs probably hold the key to cross-domain information sharing, per Andrew. He called them the "glue". They are as good as we currently have technologically for developing and leveraging the knowledge tie-ins across domains. They allow people to bring more of a "history of experiences" to the conversation. In Andrew's view, we've historically had to deal with the limitations of what technology could do when thinking about knowledge sharing. Our ways of working have bent to those limitations. But we should start to try to bend the machines more to the way humans store, process, and share knowledge. Almost a higher-order law, similar to Conway's Law, that we must push the machines to communicate and work in the way humans think.

Scott asked about Dave McComb's data centricity concept. Andrew thinks that data centricity might swing too far towards data only, instead of knowledge first or knowledge on an equal footing with the 1s and 0s of software. It might shake things up to move towards data centricity, but does it get us closer to knowledge as a first-class citizen? Per a follow-up, Andrew sees data, metadata, and logic as components that must work together in their specified functions in harmony and balance, much like your vital organs. So just swinging to data doesn't necessarily swing us towards knowledge. Andrew brought up the concept of the 2D person in physics: what might be just a simple line in three dimensions - something you can easily avoid - is a full-stop blocker to the 2D person. 2D is choosing between "is this data or is this code". We are living in a 3D world - even 4D when you think of time - so we need to move past our current ways of working and think on a higher plane about how to approach our work.

Data products in data mesh are simply a means to an end for Andrew. They are the building blocks to build out your knowledge repository and sharing. But it's crucial to understand that they serve a purpose in the greater organizational sense; they are not the end accomplishment.

Andrew finished up by sharing his ideas around data monetization. If you are specifically thinking of data, even for internal sharing, as a monetary asset, does that put experimentation on the back burner? Do you only go for the sure bets? Or only focus on things you know have a specific use instead of things that are simply potentially valuable? Andrew thinks there is significant value in R&D incubation in general, and specifically in data. We aren't yet in a world where producing those speculative or merely interesting data products with no specific utility is cheap, so maybe those come later. But it's important not to become overly focused on the immediate utility of each data product itself instead of how it fits into the greater picture of your organization's knowledge.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
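To make the data/metadata/code harmonization a bit more concrete, here is a minimal, hypothetical sketch of a mesh data product that bundles all three together. The class shape, field names, and example domain are assumptions for illustration only - not Andrew's design or any platform's actual interface.

```python
from dataclasses import dataclass
from typing import Callable
import datetime

@dataclass
class DataProduct:
    # the 1s and 0s: where the data itself lives (location string is illustrative)
    data_location: str
    # metadata: the context that turns raw data into shareable knowledge
    description: str
    domain: str
    last_updated: datetime.date
    semantics: dict  # field name -> business meaning
    # code: transformation and quality logic shipped alongside the data
    transform: Callable[[list], list]
    quality_check: Callable[[list], bool]

def drop_test_orders(rows: list) -> list:
    """Example transform: remove rows flagged as test traffic."""
    return [r for r in rows if not r.get("is_test")]

orders = DataProduct(
    data_location="warehouse.sales.orders_v2",
    description="Customer orders, one row per confirmed order",
    domain="sales",
    last_updated=datetime.date(2022, 7, 11),
    semantics={"order_total": "gross order value in EUR, including tax"},
    transform=drop_test_orders,
    quality_check=lambda rows: all(r.get("order_total", 0) >= 0 for r in rows),
)

rows = [{"order_total": 120.0}, {"order_total": 35.5, "is_test": True}]
clean = orders.transform(rows)
print(len(clean), orders.quality_check(clean))  # 1 True
```

The point of the sketch is simply that a consumer receives the data, its meaning, and the logic that produced and validated it as one unit, rather than the 1s and 0s alone.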

Jul 10, 2022 • 29min
Weekly Episode Summaries and Programming Notes - Week of July 10, 2022 - Data Mesh Radio
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 8, 2022 • 1h 10min
#98 How to Nail Your Data Mesh Vendor Assessment: A Journey Story - Interview w/ Jen Tedrow
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.
Jen's LinkedIn: https://www.linkedin.com/in/jentedrow/

In this episode, Scott interviewed Jen Tedrow, a Product Management Consultant at Pathfinder Product Labs who is currently working with a large client on a data mesh implementation. She was only representing her own perspective in this episode.

Some key takeaways/thoughts from Jen's point of view from the conversation:
- A data mesh vendor assessment is likely to be different from almost any other vendor assessment you've done before, especially if you aren't evolving something existing. There is so much more to cover: the overall platform needs to meet your needs, integrate with how you handle data on the application side (including integrations), comply with your governance standards, fit within budget, etc. That's a lot of needles to thread.
- Spend considerably more time on the discovery process in your data mesh vendor assessment than you would for a normal vendor assessment. There are a lot of potentially hidden needs/wants, and it is far better to surface them early.
- By digging deep into stakeholders' desired outcomes, you can understand what you need to deliver, and you also get insight into driving buy-in. Address the challenges preventing them from reaching their desired outcomes and they will feel seen and heard.
- As many guests have said, lead with empathy. Change is painful. But if you are realistic with people and make them feel seen and heard, it will be much less painful.
- When speaking with potential users, again really spend the time to make them feel seen and heard - reflect back to them what you heard. And have them share their ideal state. You may not be able to fully deliver on it, but it's important to understand where they want to go.
- As you learn new information, share it in a continuous stream with stakeholders so they understand the recommendations you are making along the way and at the "end" of the assessment - it doesn't really end when you finish the assessment, hence the quotes around "end".
- Be prepared for capability gaps - possibly significant ones - between what you want now and what is available in the market or what you are able to build within your budget. There are a number of capabilities that aren't really part of any vendor offering at the moment. Waiting until everything is perfect will mean you are waiting for a while!
- It's crucial to focus on what you value most right now, and also in the future, when making vendor assessments. There are so many nice-to-haves with a data mesh implementation, but you'll need to compromise. Jen and team developed a good framework for evaluating offerings on whether they fit actual needs or just wants (a toy scoring sketch follows these show notes).
- The three most important aspects for Jen and team to meet right now through the self-serve platform were user experience/low barrier to usage, automation, and the ability to easily integrate the tools together and with the existing stack.
- When figuring out what capabilities you need from your platform, create high-level, task-based use cases, not systems requirements. This will prevent you from steering too much towards serving any one specific use or getting bogged down in tech rather than capabilities.
- Look to past instances of failed or underwhelming implementations of tools and processes in your organization to find the common ways implementations fail there. And then work to avoid those :D
- It is crucial to make sure everyone understands a data mesh implementation is an iterative process. You will continue to listen to feedback, evaluate how things are working, and then make further improvements - it's a journey!
- It will likely be quite tough to move forward in a data mesh implementation if you don't align your data strategy and work with your business partners to create a mutually beneficial target outcome. And carve out time for teams to actually be able to deliver data products and participate in your data mesh implementation - they need time to do the actual work. Make it a priority.
- It's very important to provide an easy way for teams to start participating in your data mesh implementation/journey. Just asking a team to participate won't cut it. Make it low friction, make it beneficial to them. Easier said than done, but still very important.

Jen has done a number of vendor assessments in the past. But this one was a doozy. There isn't a ton of information yet about how to do data mesh really well - especially the platform side - so it is difficult to assess exactly what capabilities you need. There are still a number of gaps in vendor offerings even when you do know which capabilities you need to meet, which makes it more difficult. Then add in that you are likely bringing on multiple new vendors at once and making sure they play nicely together and with your existing technology stack. And then there is budget... So, as mentioned, it was a doozy.

Overall, Jen's role was to account for four specific capabilities: data discoverability, provisioning, observability and quality, and access control.

For Jen, most vendor assessments have a much tighter scope around what you are trying to assess, e.g. looking for a streaming technology or looking for an integration provider. But with a data mesh platform there are so, so many moving pieces that it was a very unique and difficult challenge to find a good, harmonious match that covers as much of the current needs as possible and to work with the vendors to grow together to cover future needs. Jen used a framework she has used for her vendor assessments historically: Discover -> Align -> Assess. The goal is to answer what is important now and where we need to go - and then whether a certain offering helps address both of those points. The big difference with this assessment was how much more time was spent on discovery.

Discovery in general can be challenging for a few reasons, per Jen. One is that people want to move quickly - make the decision and plow forward. But that can often lead to not picking a good direction and creating hard-to-pay-down tech debt. So investing the time to understand all your needs/wants is crucial. Digging deep into stakeholders' desired outcomes has a few benefits: 1) you know what you need to deliver and 2) you know how to drive buy-in - by addressing the challenges preventing them from reaching their desired outcomes. Jen stressed, as many, many past guests have, the importance of leading with empathy when working on anything data mesh related. Change is on the horizon for everyone, and change is painful. When leading with empathy, stakeholders and users alike have been very willing to share with Jen their challenges and where they want to go in the future with data.

"Why are we spending this time together?" is an important question to answer, per Jen. The people you interview will be more willing to openly share if you spend your time listening rather than selling. Make them feel seen and heard, and spend the time to reflect back what someone said - let them know you weren't just hearing but listening. Talk with them about the current state and their desired state. And of course let them know that implementing data mesh isn't a threat to their jobs.

Per Jen, you may not be able to give everyone their desired state, especially in the initial implementation phase, but understanding why they want certain capabilities might make it easier to deliver something of value to them, even if not the ideal. And then spend time with the stakeholders to constantly share what you are learning so it isn't a sudden recommendation at the end of an evaluation phase - that constant feedback lets people know why you are making the decisions you are.

Right now, pretty much no matter what capabilities you are looking for in your data mesh platform as part of a vendor assessment, expect gaps, per Jen. Zhamak has mentioned this frequently too. It is important to evaluate what is necessary and what is nice to have now, and then what will be necessary down the road. Will the vendors or offerings you are looking at be able to grow into those gaps over time to meet future needs too?

One aspect that made this assessment so different for Jen was that vendor assessments typically look for a single or logically bundled capability and run a vendor bakeoff - that is based on the "known knowns". But for data mesh right now, there had to be so much more discovery work. It was crucial to focus on what stakeholders really value. For Jen and team, right now, that was lowering the barriers and friction to usage, so the UX (user experience) was pretty crucial. As was being able to stitch solutions together without a ton of custom work.

For Jen, a few requirements really came to the surface as crucial. Again, user experience was one. But automation was another. How could they make creating and managing a data product an easy transition - or at least as easy as possible?

What worked well for Jen and team to really understand which capabilities were actually crucial was focusing very much on task-based, high-level use cases without involving any specific systems requirements. It meant they could focus much more on what needed to get accomplished instead of specific examples that had more custom needs. Doing this created a very clear picture of what they actually needed.

When thinking about how to adapt data mesh to your organization, Jen recommends looking at what has worked - and even more closely at what hasn't worked - in past tool and process implementations specifically in your organization. What caused failures, so you can avoid going down the same path? There are so many potential areas of friction in a data mesh implementation; do your diligence to find the failure patterns common to your organization so you can avoid them.

Jen talked about how a successful data mesh implementation will really be about the intersection of people, process, and technology. You need to include good change management principles in everything you touch. And make sure people understand that this will be iterative. It won't be perfect from day one, but you are going to be listening and improving along the way. To drive momentum, Jen recommends highlighting - loudly and often - early adopter successes. It shows that you are adding value but also rewards your early adopters, which hopefully spurs others to move forward in their data mesh participation too. And be honest with everyone that doing something like data mesh will involve change - and change is painful.

Jen and Scott discussed what use cases to look for early in your journey. Jen recommends balancing three factors: 1) what will be impactful, 2) what is possible, and 3) who is willing to partner with you. It's important to show those early successes to keep the funding coming, as data mesh is not a single upfront cost - it requires continuous investment.

Jen recognizes how lucky she is to have a leader who is sharing their vision widely, driving buy-in, and aligning the data mesh strategy with business partners to make it successful for all parties. A big, crucial aspect is that teams have enough time carved out to actually create data products - without this, incentivization is tough. And constantly look to raise visibility and amplify the wins.

For Jen, it's been important to repeatedly paint a compelling vision in many conversations. It's fine to be a bit repetitive. Share the current picture and then talk about what it could become. This is important to making participants "willing to accept the pain of change". You want to develop a symbiotic, mutually beneficial relationship with those early adopters - participating in data mesh has to be a win for them too. And teams aren't ready to just adopt data mesh; you need to create the processes to support and enable them.

Jen wrapped up the conversation by reiterating a few points about your vendor assessment process: 1) be prepared to spend more time on discovery than you probably think is necessary going in, because it will highlight the pain points and capabilities that are most crucial; 2) focus on task-based use cases when considering necessary capabilities - keep the systems out of it; 3) really spend the time to understand your sourcing process internally; 4) it's okay to have very frank discussions with vendors - look to spend your AND their time wisely and share your hard constraints and requirements; and 5) constantly reflect back your progress and learnings in your assessment along the way, and especially share the results of the assessment broadly to continue to share information and drive buy-in.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
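Jen's needs-versus-wants framing can be pictured with a toy scoring sketch. The four "need" capabilities below come from the episode (data discoverability, provisioning, observability and quality, access control); the weights, scores, the nice-to-have row, and the vendor names are invented purely for illustration - this is not Jen's actual framework.

```python
# capability: (weight, is_need) -- needs drive the decision, wants only break ties
capabilities = {
    "data discoverability":      (3, True),
    "provisioning":              (3, True),
    "observability and quality": (2, True),
    "access control":            (3, True),
    "built-in lineage UI":       (1, False),  # hypothetical nice-to-have
}

# 0-5 scores per vendor per capability, as you might record after demos/trials
vendor_scores = {
    "Vendor A": {"data discoverability": 4, "provisioning": 2, "observability and quality": 3,
                 "access control": 5, "built-in lineage UI": 5},
    "Vendor B": {"data discoverability": 3, "provisioning": 4, "observability and quality": 4,
                 "access control": 4, "built-in lineage UI": 1},
}

for vendor, scores in vendor_scores.items():
    need_score = sum(w * scores[c] for c, (w, need) in capabilities.items() if need)
    want_score = sum(w * scores[c] for c, (w, need) in capabilities.items() if not need)
    gaps = [c for c, (w, need) in capabilities.items() if need and scores[c] <= 1]
    print(f"{vendor}: needs={need_score}, wants={want_score}, capability gaps={gaps or 'none'}")
```

The useful part isn't the arithmetic; it's forcing the team to declare, per capability, whether it is an actual need or just a want, and to surface gaps explicitly rather than letting a flashy nice-to-have win the bakeoff.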

Jul 5, 2022 • 12min
#97 What is a Mesh Data Product to the Business Owners and Users? - Mesh Musings 21
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Community open discussion meetup hosted by Eric Broda: https://www.youtube.com/watch?v=OwtQ37WYK1g
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 4, 2022 • 1h 14min
#96 The Power of Empowerment and Driving Business Value: Data Mesh at Roche - Interview w/ Omar Khawaja
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.
Omar's LinkedIn: https://www.linkedin.com/in/kmaomar/
Omar's State of Data Mesh presentation: https://www.youtube.com/watch?v=S5ABE4bevn4
Adam Grant book Think Again: The Power of Knowing What You Don't Know: https://www.amazon.com/Think-Again-Power-Knowing-What/dp/B08HJQHNH9
Lean Value Tree definition: https://openpracticelibrary.com/practice/lean-value-tree/

In this episode, Scott interviewed Omar Khawaja, Head of Business Intelligence at Roche Diagnostics. To be clear, Omar was only representing his own viewpoints and learnings, not necessarily those of Roche.

Some interesting thoughts/takeaways from Omar's point of view and learnings:
- If you are going to make progress in a data mesh journey, you must be okay with "good enough". Perfect is the enemy of good and done. Measure, learn, and adjust along the way, but get moving and keep moving. It's okay to make mistakes - recognize and correct them.
- Echoing a number of past guests, change management and organizational challenges will take a large portion of a data mesh implementation leader's time and effort - likely far more than most would expect. Focus on empowering people and showing them why this can work for them - and what it means for them.
- Data mesh cannot be your entire data strategy. If you are implementing data mesh, it must only be part of your data strategy. Start from the why. Why undertake something as transformational and difficult as implementing data mesh? What business value will it deliver?
- Data-as-a-product thinking is the true heart of a data mesh implementation. It's far more than just creating data products. Data product discovery is crucial, much like discovery in regular product management. Take considerable learnings from product management in other disciplines.
- Focus on outcomes in day-to-day data work. What are you trying to deliver? What is the value in it? For whom? How will we measure if we are successful? And were we actually successful? (A toy sketch of these discovery questions follows these show notes.)
- We need to get data people to rethink creating point solutions - sometimes called project management thinking - where they deliver a dashboard and the dashboard itself is the focus. This leads to fragility that could be prevented by focusing on the entire data lifecycle used to create the dashboard, with the dashboard - and many other chances for data reuse - as an output.
- Roche is being quite flexible around who develops data products - it is all about the capabilities and the needs. Often, it is data engineers in the domains, enabled by the central platform team. But it can be data/business analysts or software engineers too. If the data product isn't overly complex, or if a business analyst really understands data, why can't they be the data product developer?
- It would have been the definition of insanity - trying the same thing over and over and expecting different results - for Roche to just move from an on-prem data lake that was having scaling and quality issues to a cloud data lake. Many other aspects needed to change. The organization needed to unlearn and relearn a number of things, and data mesh was a great vision for where they could go.
- Roche saw some duplication of work across data products, so they adjusted and made their data product discovery and design phases very public. Making them public can also increase collaboration early in a data product's life, so you might find additional data consumers during the development phase.

Omar started the conversation with a definition of what business intelligence means to him and how it has evolved from a mostly reports-based function - or descriptive analytics - to include predictive analytics and then prescriptive analytics. But monolithic approaches - enterprise data warehouse, data lake, etc. - just haven't led to great outcomes for many (most?) large organizations. Omar quoted Albert Einstein: "Insanity is doing the same thing over and over and expecting different results." So why keep trying to throw technology and a monolithic architecture at our growing data and analytics challenges? So in 2020, Omar was happy to help evaluate whether decentralization could work in data for Roche.

They moved to a domain-aligned model, with the business intelligence, analytics, and data engineers - at least those not building the platform - moving into the domains. That way, they can become the people who know the data best, including the business context. They can help to shape and develop data products.

For Omar, in the way most analytics work has been done historically - and in how people have thought about analytics work - the output of the analysis itself is the focus. So an analyst creating a dashboard might spend all their time focusing on the dashboard with little to no thought about the fragility of the inputs to the dashboard. What happens if upstream data changes? In data mesh, the dashboard is an output of data work that is more easily created and managed because the analyst knows the upstream will be maintained as a product, and even if there are changes, there is communication about impending changes. And they might be able to reuse the data in another analysis.

Regarding the roles needed in domains for data mesh, Omar gave the incredibly common data mesh answer of "it depends". There needs to be an owner who understands the data-as-a-product concept - not just creating data products - as the lifecycle of data is crucial to doing data mesh well. But each domain and each data product has different needs, so focus on building the cross-disciplinary team needed to take care of the job now - and in the future - and not on the exact composition of each team. The use cases aren't cookie cutter; why make the teams cookie cutter?

At Roche, for many domains, the initial data product design is done by a senior architect, while the data product development is done by a data engineer - possibly called an analytics engineer. But if there isn't a need for extensive data engineering for the data product, it can be developed by a data/business analyst or software engineer. The role title doesn't end up mattering; the team capabilities and the needs do. Design with data consumers in mind.

Per Omar, every organization has some kind of existing analytics practice - a brownfield. It is important to leverage what you've built historically - both the types of analytics and the team. You might have great insight into what information consumers want. And many of the people involved in the data warehouse or lake can evolve into roles that are highly valuable in data mesh - they probably know your data quite well. But it is crucial that they understand why they need to evolve and are given the resources to do so.

Echoing many previous guests, Omar said to focus on the outcomes in data mesh, not the exact structure. Empower your teams to figure it out and enable them to do the work, especially via the data platform. The role of the "citizen data scientist" never really worked; now we can focus on giving many more people access to information and insights, not just access to data with little to no context. And sharing across the company is crucial to finding scalable, repeatable practices and patterns.

At Thoughtworks' State of Data Mesh conference, Omar presented how each of the four pillars or principles of data mesh represents mind, body, heart, and soul. Digging into data-as-a-product - the heart of data mesh in Omar's analogy - there is far more than just creating data products. He shared their learnings in data product discovery: how do you figure out what products you need? What are the expected outcomes from creating this data product? Who is going to use it?

Omar discussed the need for a real mindset shift, especially to understand data-as-a-product. It doesn't come naturally for most people, so you need very conscious change management - change management and organizational challenges will almost certainly take considerably more of your time in a data mesh implementation than you expect. And it's not shoving people forward or dragging them - it's taking them by the hand and working with them to find a good way forward. Once they feel the empowerment, Omar is seeing most people really like this new way of working. And they are focused on continual value delivery instead of the project, one-off type of value creation.

When Omar joined Roche a couple of years ago, his first big task was forming the data strategy. Should they just do a data lake in the cloud? Was that transition working well for most organizations? The answer for him was no, and he again pointed to the definition of insanity - trying the same thing over and over and expecting different results. He took a lot of inspiration from the book Think Again by Adam Grant. There was a lot the organization needed to unlearn or relearn. And when COVID-19 hit, it forced organizations to get much better at communication and collaboration, so data mesh was a great fit - they were already embracing decentralization.

Omar talked about how data mesh was one aspect of their data strategy, but only one aspect. The overall goal was and is "how do we become data-driven as a company?" So you should start from the why. Why consider data mesh? What are you trying to achieve? Is it worth it?

This set of questions should also apply to everyone's day-to-day work in a data mesh implementation, per Omar. Again, focus on outcomes: what are you actually trying to deliver? Is it just a dashboard, or are you trying to deliver the insights that dashboard will create? Omar recommended the Lean Value Tree approach as one way to focus your time.

Omar returned to the concept of product mindset and product thinking in data. What value are you trying to deliver? How much value do you expect it will deliver? How will we measure if we are successful? And were we successful in delivering the expected value? A big part of this is the discovery process - drive towards a business-focused discussion about outcomes.

In Omar's view, no one will get absolutely everything correct, much less in something as large, complex, and new as data mesh. It's okay to make mistakes - that's part of learning. But you need to get to a "good enough" place and move forward. Measure along the way and adjust. It's a journey; there will be trials and tribulations along the way. Learn from them and adjust. Collaborate and move forward together.

In Roche's early journey, they found some teams were duplicating work, so they moved to fix that. What they learned was that they should provide very early visibility into plans to prevent teams from spending time on the same things. The data product discovery and design phases are now quite public, and that's worked well. Instead of teams duplicating work, they are often early consumers of data work other teams have done.

In wrapping up, Omar again reiterated to focus on what you are trying to deliver and what the value is, and that it's okay to move forward with an incomplete picture. You'll make some mistakes, but prepare to learn and adjust and just get to making progress. Pretty sound advice.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
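As one way to picture Omar's outcome questions, here is a hypothetical "data product discovery" record that captures them as structured fields before any build work starts. The field names and the example product are assumptions for illustration only, not a Roche artifact or a standard template.

```python
# Omar's discovery questions, written down as a reviewable record:
# what are we delivering, what's the value, for whom, how will success be measured,
# and when do we go back and check whether we were actually successful?
data_product_proposal = {
    "name": "diagnostics-instrument-utilization",          # hypothetical product
    "what_are_we_trying_to_deliver": "Daily utilization rates per instrument and site",
    "expected_value": "Identify under-used instruments to rebalance capacity",
    "for_whom": ["site operations leads", "supply chain planning"],
    "success_measure": "At least two consuming teams within one quarter; "
                       "capacity decisions reference the product",
    "review_after_months": 3,  # were we actually successful?
}

# Publishing the proposal early (e.g. in a shared catalog or repo) is the practice the
# episode describes for avoiding duplicated work and attracting early consumers.
print(f"Proposed data product: {data_product_proposal['name']}")
for consumer in data_product_proposal["for_whom"]:
    print(f"  intended consumer: {consumer}")
```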

Jul 3, 2022 • 33min
Weekly Episode Summaries and Programming Notes - Week of July 3, 2022 - Data Mesh Radio
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jul 1, 2022 • 1h
#95 Measuring Your Data Mesh Journey Progress with Fitness Functions - Interview w/ Dave Colls
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.
Dave's LinkedIn: https://www.linkedin.com/in/davidcolls/
Zhamak's data mesh book: https://www.oreilly.com/library/view/data-mesh/9781492092384/
Building Evolutionary Architectures book: https://www.oreilly.com/library/view/building-evolutionary-architectures/9781491986356/
Team Topologies book: https://teamtopologies.com/book
The Agile Triangle regarding helping you decide how thin to slice: https://www.projectmanagement.com/blog/blogPostingView.cfm?blogPostingID=5325&thisPageURL=/blog-post/5325/The-Agile-Triangle#_=_

In this episode, Scott interviewed Dave Colls, Director of Data and AI at Thoughtworks Australia. Scott invited Dave on due to a few pieces of content, including a webinar on fitness functions with Zhamak in 2021. There aren't any actual bears, as guests or referenced, in the episode :)

To start, some key takeaways/thoughts and remaining questions:
- Fitness functions are a very useful tool for assessing questions of progress/success at a granular and easy-to-answer level. Those answers can then be summed up into a greater big picture. You should start with fitness functions early in your data mesh journey so you can also measure your progress along the way. To develop your fitness functions, ask "what does good look like?" (A minimal sketch follows these show notes.)
- Focus your fitness functions on measuring things that you will act on or that are important to measuring success. Something like the amount of data processed is probably a vanity metric - drive towards value-based measurements instead.
- Your fitness functions may lose relevance, and that is okay. You should be measuring how well you are doing overall, not locking on to measuring the same thing every X time period. What helps you assess your success? Again, measure things you will act on; otherwise it's just a metric.
- Dave believes the reason to create - or the genesis of - a mesh data product should be a specific use case. The data product can evolve to serve multiple consumers, but to start, you should not create data products unless you know how they will (likely?) be consumed and have at least one consumer.
- Team Topologies can be an effective approach to implementing data mesh. Using the TT approach, the enablement team should focus simultaneously on 1) speeding the time to value of the specific stream-aligned teams they are collaborating with and 2) looking for reusable patterns and implementation details to add to the platform to make future data product creation and management easier.
- We still don't have a great approach to evolving our data products to keep our analytical plane in sync with "the changing reality" of the actual domain on the operational plane. On the one hand, we want to maintain a picture of reality. On the other, data product evolution can cause issues for data consumers. So we must balance reflecting a fast-changing reality against data consumer disruption, including downstream and cross data product interoperability. There aren't great patterns for how to do that yet.
- There is a tradeoff to consider regarding mesh data product size. Dave recommends you resist the pull of historical data ways - and woes - of trying to tackle too much at once. The smaller the data product, the less scope it has, which makes it easier to maintain and quickens the deploy and feedback cycle. But smaller-scope data products will increase the total number of data products, likely making data discovery harder. And do we have data product owners with many data products in their portfolios? Dave recommends using the Agile Triangle framework to figure out a good data product scope (link at the end).

Dave mentioned he first started discussing fitness functions regarding data mesh to shift the conversation from people asking "what do we build?" to "what does good look like?" Fitness functions, when done right, can give a good view of how well an organization is doing relative to its data mesh implementation goals by providing objective measures of success at a granular level that can be summed up into a bigger picture.

So what is a fitness function? As defined by Thoughtworks in a May 2018 Technology Radar: "Borrowed from evolutionary computing, a fitness function is used to summarize how close a given design solution is to achieving the set aims. ... An architectural fitness function, as defined in Building Evolutionary Architectures, provides an objective integrity assessment of some architectural characteristics, which may encompass existing verification criteria, such as unit testing, metrics, monitors, and so on." Source: https://www.thoughtworks.com/radar/techniques/architectural-fitness-function

Fitness functions can take us from measuring success via vanity metrics - like the amount of data processed or stored - to value-based metrics, per Dave. It is important to think about what good looks like for now and for the future. So putting your fitness functions in place early in your data mesh journey can give you a good sense of where you've been when designing where you want to go. Fitness functions give you the ability to stay focused on "why are we doing this?" - intentionality is crucial.

When thinking concretely about fitness functions for a data product, Dave gave a few examples, e.g. does this meet health checks for testing, is it satisfying its SLOs, etc. It can be a good idea to have a target metric with a yes/no type of answer as you start to use fitness functions. Metric measurements without context are typically not valuable: a latency of 5 minutes might be great for one data product and not another, and 90% accuracy might be atrocious for one and great for another.

You can implement fitness functions for all aspects of a data mesh implementation in Dave's view. Look at the four key principles of data mesh from Zhamak's work and you can start to break down your goals for each one into fitness functions. A good overall question to try to answer is: are we reducing the interdependence of domains? So for the domains, are they providing value via their data products to consumers? For the platform team, have they made it easier to create and manage data products? For governance, overall, is the value of the whole implementation greater than the sum of the parts? You can answer fitness functions at a micro level and then take your overall measurements to get a more complete picture of how your implementation is going.

For Dave, when assessing that bigger picture, as previously mentioned, it is good to think about your bigger picture and your measurement of success over time. What you measure with fitness functions can - and should - evolve, but having your rates and ratios span your implementation timeline can give you a good indication of where you've been improving and where you need to work more.

Similar to Shane Gibson's episode on using patterns in data, fitness functions may be valuable to other organizations and not yours, and they may lose relevance over time, per Dave. Measure things that will cause you to act based on the outcomes. Nothing in your data mesh journey should really be seen as done and fixed. Things should evolve, or it means your organization is stagnant. Change for the sake of change is obviously bad, but you should evaluate whether your fitness functions are still helping you measure against your idea of what good looks like. Do we need fitness functions for our fitness functions?

Dave talked about how well Team Topologies aligns with implementing data mesh, as organizational changes are such a crucial part of success. The Team Topologies approach focuses on enabling the domain team - called a stream-aligned team - to be "the primary unit of value delivery in IT". A platform team enables, but loosely coordinates at most when possible, to avoid blocking the work of the stream-aligned teams as much as possible. Per some past research conducted by Dave, it took 12x longer for a team to do work if they had to go outside their team - so prevent that if possible! But Dave warned that right now, especially in data mesh, it is important to not just add more and more work to the stream-aligned team. Team Topologies can help us answer how to build capabilities in a decentralized world, especially to implement something like data mesh. Per Dave, it is helpful when there is multi-disciplinary collaboration between the stream-aligned teams and the enablement teams to develop their first data products. The enablement team is also tasked with bringing incremental learnings back to add to the platform, to make the next team's work creating and maintaining data products better - continuous improvement via learning from each implementation.

On this topic, Dave talked about how useful it is to optimize for learning - rather than optimizing purely for the initial value creation of each data product - in a data mesh implementation. With the enablement teams bringing learning back to improve the core platform, we can implement friction-reducing enhancements like sensible defaults and starter kits/templates. Focus on efficiently learning.

When designing/creating a mesh data product, Dave recommends a customer-led approach. There should be a consumer with a specific use case as the reason to create a data product. A mesh data product should be a valuable representation of the domain through data. But the operational and analytical planes naturally diverge unless we evolve the analytical to match the new reality in the operational. And we don't have a great way of evolving the analytical without data consumer disruption.

Per Dave, creating a "thin slice" is one way to help maintain representing your domains via mesh data products. Dave recommends you resist the pull - and resulting woes - of past data ways of trying to tackle too much at once - hence thin slicing. Get to value delivery and feedback quickly. A thinner slice has a reduced scope, so you will likely not have as much difficulty maintaining that singular data product. But Scott wanted to make sure people understand that the micro-microservices model - as in small even for microservices - was kind of a disaster, so be careful not to slice too thinly. You don't want way too many data products, as that can also make data discovery more difficult. It's all tradeoffs in the end :) And you can use fitness functions to measure whether you are making the right tradeoffs too!

Dave's LinkedIn: https://www.linkedin.com/in/davidcolls/
Zhamak's data mesh book: https://www.oreilly.com/library/view/data-mesh/9781492092384/
Building Evolutionary Architectures book: https://www.oreilly.com/library/view/building-evolutionary-architectures/9781491986356/
Team Topologies book: https://teamtopologies.com/book
The Agile Triangle regarding helping you decide how thin to slice: https://www.projectmanagement.com/blog/blogPostingView.cfm?blogPostingID=5325&thisPageURL=/blog-post/5325/The-Agile-Triangle#_=_

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
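For a concrete feel of the yes/no, target-based fitness functions Dave describes, here is a minimal sketch for a single data product. The metric names, thresholds, and the roll-up into an overall score are illustrative assumptions; real fitness functions would be agreed per data product and per implementation goal.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class FitnessFunction:
    name: str
    check: Callable[[dict], bool]  # takes observed metrics, returns pass/fail

# Hypothetical observed metrics for one data product; names are illustrative.
observed = {
    "freshness_minutes": 12,
    "completeness_pct": 99.2,
    "schema_tests_passed": True,
    "consumers_served": 3,
}

fitness_functions = [
    FitnessFunction("meets freshness SLO (<= 15 min)", lambda m: m["freshness_minutes"] <= 15),
    FitnessFunction("meets completeness SLO (>= 99%)", lambda m: m["completeness_pct"] >= 99.0),
    FitnessFunction("passes schema/unit tests", lambda m: m["schema_tests_passed"]),
    FitnessFunction("has at least one real consumer", lambda m: m["consumers_served"] >= 1),
]

results = {f.name: f.check(observed) for f in fitness_functions}
score = sum(results.values()) / len(results)  # roll granular answers up into a bigger picture

for name, passed in results.items():
    print(f"{'PASS' if passed else 'FAIL'}  {name}")
print(f"Overall fitness: {score:.0%}")
```

Summing the granular pass/fail answers into a ratio, and tracking that ratio over the course of the journey, is the "sum up to a bigger picture" idea from the episode; the individual checks stay tied to "what does good look like?" for that specific data product.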

Jun 28, 2022 • 8min
#94 Data Mesh Therapy - Come Vent and Chat - Mesh Musings 20
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Sign up for a Data Mesh Therapy session here: https://calendly.com/data-as-a-product/data-mesh-therapy
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here.
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf

Jun 27, 2022 • 1h 11min
#93 Empower to the People: Data Collaboration and Observability at Enterprise Scale - Interview w/ Jay Sen
Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/Please Rate and Review us on your podcast app of choice!If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see hereEpisode list and links to all available episode transcripts here.Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.Jay's LinkedIn: https://www.linkedin.com/in/jaysen2/Posts by Jay:Next-Gen Data Movement Platform at PayPal: https://medium.com/paypal-tech/next-gen-data-movement-platform-at-paypal-100f70a7a6bHow PayPal Moves Secure and Encrypted Data Across Security Zones: https://medium.com/paypal-tech/how-paypal-moves-secure-and-encrypted-data-across-security-zones-10010c1788ceThe Evolution of Data-Movement Systems: https://jaysen99.medium.com/evolution-of-data-movement-f12614d6e9deIn this episode, Scott interviewed Jay Sen, Data Platforms & Domain Expert/Builder and OSS Committer. While Jay currently works at PayPal, he was only representing his own view points.Some key takeaways/thoughts from Jay's view:When you get to a certain scale, any central team should focus on, as Jay said, "Empower people, don't try do their jobs." That's how you build towards scale and maintain flexibility - your centralized team likely won't become a bottleneck if they aren't making decisions on behalf of other teams.To actually empower other teams, dig into the actual business need and work backwards to a solution that can solve that. If there is a solution already in place that isn't working any more, look to find ways to augment that rather than trying to replace or reinvent the wheel.Self-service is a slippery slope - it often solves the immediate problem of time to market but also creates next level challenges. A big issue is that when you remove the friction to data access, you are throwing challenge of finding right data on consumers plate. Data contracts are great when everybody aligns on a single contract and there are enough tools to support the contracts. But they also create a proliferation of data to enforce the contracts required by multiple consumers - thus, they often don't survive the real world.The data catalog space is finally getting some needed attention. But there are still a myriad of issues that need solving. Will those be solved by technology or by leveraging a "data concierge" remains to be seen.It's insanely easy to overspend in the cloud. Everyone is vaguely aware but cost should be part of every important architectural discussion. You can drive business value but it absolutely must also be focused on the cost as return on investment is far more important than simply return.Jay took a few lessons from working on a central services team in a company of ~200 people. Having a centralized team was doable at first but as the org scaled, it quickly got complicated. As a centralized team, it's very easy to become a bottleneck but Jay learned a lesson that has continued to help in his subsequent roles: "empower people, don't do their jobs." Focus on reducing the friction to others doing their work instead of doing it for them.Easier said than done so how do you empower people? Per Jay, you must understand the business aspect and what the requestor actually needs. 
There is often a high cost to adopting any technology, in Jay's view - not just the initial adoption but the ongoing tool stewardship. That cost is typically much higher for the latest technology, so if you are looking at an immature solution, make sure you're focused on solving the right problems.
Jay shared that data contracts are evolving in a good direction, but they still don't address all the challenges people want them to handle. They can be really good at making sure people understand their responsibilities. As Emily Gorcenski noted in her episode as well, you must drive to meaningful conversations between producers and consumers to define quality very specifically, along with other SLOs. In Jay's view, contracts work in an ideal world of one-to-one communication between domains, but often there are multiple parties on each side that view things slightly differently. So contracts rarely cover all use cases fully and are, at best, a good conversation point for negotiations.
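As a rough illustration of what "defining quality very specifically, along with other SLOs" can look like, here is a minimal sketch of a data contract expressed in code. The fields, names, thresholds, and the single quality check are illustrative assumptions only - no particular contract format or tooling from the episode is implied.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class DataContract:
    dataset: str
    owner: str               # producing domain accountable for the data
    schema: dict             # column name -> expected type
    max_null_rate: float     # agreed quality threshold (0.01 = at most 1% nulls)
    freshness: timedelta     # how stale the data may be before the SLO is breached
    notice_period_days: int  # warning required before a breaking schema change


# Hypothetical contract a producer and consumer might negotiate
orders_contract = DataContract(
    dataset="orders.daily_summary",
    owner="order-management-domain",
    schema={"order_id": "string", "order_total": "decimal", "order_ts": "timestamp"},
    max_null_rate=0.01,
    freshness=timedelta(hours=6),
    notice_period_days=30,
)


def null_rate_within_slo(observed_null_rate: float, contract: DataContract) -> bool:
    """One concrete, testable check derived from the negotiated contract."""
    return observed_null_rate <= contract.max_null_rate


print(null_rate_within_slo(0.03, orders_contract))  # False - 3% nulls breaches the agreed 1%
```

Even a small artifact like this forces producers and consumers to negotiate actual numbers rather than vague intentions - and it also shows where Jay's caveat comes in: with multiple consumers, you can quickly end up maintaining several slightly different versions of the "same" contract.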
Jay is excited that the data catalog space is getting some very necessary attention. Every company of any real size is now dealing with petabyte-scale data, so organizing it is becoming a major necessity. There are quite a few challenges left to tackle: 1) automated dataset discovery and definitions; 2) naming conventions still aren't standardized; 3) over-reliance on auto-documentation when human input is required; 4) how do we build trust? 5) how can we empower data applications? 6) how do we deal with the trapped-metadata problem? etc. Jay believes the catalog must capture not just what a dataset is but also why it exists.
When asked whether systems or people should be the focus in enabling data discovery, Jay said to focus on systems to make the onboarding experience the best it can be - that will make it easiest for people as you scale. Scott disagrees and believes a "Data Concierge" role will serve organizations well - but that exceedingly few organizations will actually create and leverage such a role.
Jay then shared his thoughts on understanding, containing, and preventing unnecessary costs in cloud data management. This matters because it is very easy to spend a lot when you move to the cloud - Scott agreed, having previously managed AWS costs for a public company and seen that first hand. Jay pointed out that a one-to-one mapping of what you were doing on-prem to the cloud is often a bad idea: the cost structure is frequently very different and can cost you a fortune, but rearchitecting also has a cost. Evaluating cost should play a role in every part of data work if you really want to drive good business outcomes.
In wrapping up, Jay reiterated that technology needs to solve real business problems, not just be cool. Really consider the long-term costs of adopting a new solution.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf