
Data Mesh Radio

Latest episodes

Mar 25, 2024 • 1h 8min

#298 Effective Partnering With Business Execs - Learnings from Another Data Mesh Journey - Interview w/ Jessika Milhomem

Please Rate and Review us on your podcast app of choice!

Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn.

Transcript for this episode (link) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info-gated) here.

Jessika's LinkedIn: https://www.linkedin.com/in/jmilhomem/

In this episode, Scott interviewed Jessika Milhomem, Analytics Engineering Manager and Global Fraud Data Squad Leader at Nubank. To be clear, she was only representing her own views on the episode.

Some key takeaways/thoughts from Jessika's point of view:

- There are no silver bullets in data. Be prepared to make trade-offs. And make non-data folks understand that too!
- Far too often, people are looking only at a target end-result of leveraging data. Many execs aren't leaning in to how to actually work with the data and set themselves up to succeed through data. Data isn't a magic wand; it takes effort to drive results.
- Relatedly, there is a disconnect between the impact of bad-quality data and what business partners need to do to ensure data is high enough quality for them.
- Poor data quality results in four potential issues that cost the company: regulatory violations/fines, higher operational costs, loss of revenue, and negative reputational impact.
- There's a real lack of understanding by business execs of how data work ties directly into their strategy and day-to-day. It's not integrated. Good data work isn't simply an output; it needs to be integrated into your general business initiatives.
- More business execs really need to embrace data as a product and data product thinking. Instead of focusing only on the short-term impact of data - typically answering a single question - how can we integrate data into our work to drive short-, mid-, and long-term value?
- ?Controversial?: In data mesh, within larger domains like Marketing or Credit Cards in a bank, it is absolutely okay to have a centralized data team rather than trying to have smaller data product teams in each subdomain. Scott note: this is actually a common pattern and seems to work well.
- Relatedly, the pattern of centralized data teams in the domains leads to easier compliance with regulators because there is one team focused on reporting one view instead of trying to have multiple teams contribute to that view.
- When you really start to federate data ownership, business execs can partner far more easily with other business execs in other domains leveraging data. Instead of having the central data team trying to translate, there is a focus on what needs to get done and the data work flows from that instead of the data work being the focus. It's the engine that powers their collaboration but it's no longer 'the point'.
- Partnering with those who "are closer to the reality" of the business makes it easier and more likely to drive good outcomes. Meaning: not the senior execs. But the senior execs often have to be on board with the work and the target results. So work on communicating up but closely collaborating at lower levels.
- Data for regulators often has a LOT of potential reuse for your own organization. Lean into finding those areas where you can do the data work once and get value twice :)
- ?Controversial?: Really consider role titles in data mesh. Data product owner might be too nebulous and quickly accumulate too many responsibilities. Data product manager makes it easier to understand the scope of responsibilities and the specific areas of focus. Scott note: this comes up A LOT; organizations are generally starting with data product owner and moving to data product manager.
- ?Controversial?: Data leaders need to understand product management. To really scale data work, we have to start treating all aspects as a product practice. CTOs down to software engineers need to understand product management; it's time for the data org to as well.
- Data leaders need to have significant communication skills while maintaining their understanding of data best practices. It's all a delicate balance but the data work doesn't speak for itself.

Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Mar 18, 2024 • 58min

#297 Panel: Understanding and Leveraging the Data Value Chain - Led by Marisa Fish w/ Tina Albrecht, Karolina Stosio, and Kinda El Maarry, PhD

Marisa's LinkedIn: https://www.linkedin.com/in/marisafish/
Karolina's LinkedIn: https://www.linkedin.com/in/karolinastosio/
Tina's LinkedIn: https://www.linkedin.com/in/christina-albrecht-69a6833a/
Kinda's LinkedIn: https://www.linkedin.com/in/kindamaarry/

In this episode, guest host Marisa Fish (guest of episode #115), Senior Technical Architect at Salesforce, facilitated a discussion with Kinda El Maarry, PhD, Director of Data Governance and Business Intelligence at Prima (guest of episode #246), Tina Albrecht, Senior Director Transformation at Exxeta (guest of episode #228), and Karolina Stosio, Senior Project Manager of AI at Munich Re. As per usual, all guests were only reflecting their own views.

The topic for this panel was understanding and leveraging the data value chain. This is a complicated but crucial topic as so many companies struggle to understand the collection + storage, processing, and then specifically usage of data to drive value. There is way too much focus on the processing, as if upstream of processing isn't a crucial aspect and as if value just happens by creating high-quality data.

A note from Marisa: Our panel is comprised of a group of data professionals who study business, architecture, artificial intelligence, and data because we want to know how (direct) data adds value to the development of goods and services within a business, and how (indirect) data enables that development. Most importantly, we want to help stakeholders better understand why data is critical to their organization's business administration strategy and is a keystone in their value chain.

Also, we lost Karolina for a bit there towards the end due to a spotty internet connection.

Scott note: As per usual, I share my takeaways rather than trying to reflect the nuance of the panelists' views individually.

Scott's Top Takeaways:

- If you want to dig deeper into the data value chain, consider looking into the value streams concept. What flows through your business in terms of process to generate value? Where are there points of value leakage? The same concepts are crucial in your value chain.
- Organizations need to educate their entire organization on the data value chain. Part of why there are so many issues in data from upstream changes by developers breaking downstream data is that they simply don't know what parts of their data are used and why. Communication is a much bigger aspect of doing data than people think.
- Even talking about the specific data value chain can cause people to focus too much on the data work instead of the business value delivered via data. The data value chain is crucial to understand, but it's also crucial to understand that data work doesn't inherently create value; it's about how it's used in the business. Dig into the value created and focus on working backwards from that to what data work needs to be done.
- The data value chain is crucial for companies of all sizes across all industries. At its heart, the concept is about ensuring you aren't leaking value in your business value streams/pipelines. You need to focus on what drives value and how to improve the processes there.
- Data value chains often cross line-of-business/domain boundaries. After all, a lot of the value of data is about combining information across those boundaries. That can mean cross-team handoffs, which make understanding and ensuring the success of those data value chains even harder. Who owns what isn't inherently understood/agreed to; you need to get specific.
- It's important not to get overly focused on a single end-point of value when it comes to data work, especially when it comes to a data product. If we want reuse, we have to focus on the processes of creating reusable value. Maintaining that larger-picture focus while still ensuring each data consumer can get value from a data product is a very hard balance.
- Focusing heavily on your data value chain is going to be hard. It means hard work and a lot of internal collaboration - and thus negotiation - across domain boundaries. You all have to be in it together to really get the best results - and some organizations aren't ready for that. But the hard work pays off because you are ensuring value actually gets created.
- As with anything in data, you have to make bets. That doesn't mean every bit of data work will create significant value or even exceed the investment. But an approach like the data value chain is crucial to understand 1) what bets you are making and why and 2) who owns what aspects of the data work. That can help you really focus on the what and why rather than focusing on outputs.

Other Important Takeaways (many touch on similar points from different aspects):

- As with many things in data, ownership is crucial to understanding your value chain. The weakest points in a value chain are the handoffs between teams. Strong ownership, including of those handoffs, prevents value leakage (from the value streams concept).
- To understand your data value chain, you will have to go deeper than many are willing to in the (dreaded?) operational plane. You have to understand what data you have, how it's collected, what data you can collect, etc. Some of it is working backward from what data you need/want, but a lot of it is working from what data you have or can get.
- Relatedly, the value you can create from data is heavily reliant on what matters to the business. To think about value, you have to understand your business processes and what generates actual value.
- You really need to consider your approach to data collection and storage. How do you want to consider data that may have value but hasn't yet proven to have value? You don't want costs to go out of control, and most data is never back-cleaned/filled if it wasn't collected and stored for use. But you can't know all your data use cases at the launch of a new application or product. It's a balancing act.
- There is a question of how mature you need to be as an organization to really use the data value chain as a framework instead of merely some principles to guide your work. It can be hard to get people to understand the value and what drives that value in data when they don't understand data work in general.
- Relatedly, really digging into the data value chain can shine a light on underperforming activities inside and outside the data function. So you need to be prepared for some hard realizations and questions. Are you ready for transparency?
- What aspects of data value chains fall on the business? It's a hard question. At the end of the day, data value chains are supportive of the business value chains/streams, but it depends on who has ownership over data work: the lines of business or a centralized team. Your data value chains should have explicit ownership, at least of the different 'links' of the chain.
- In data mesh especially, but true in any data work, it's important not to see the data product as the end of the data value chain. The data product is there to make it easy for producers to reliably and scalably deliver value through data. But there is only value if that data is consumed; the value happens when someone takes action.
- When launching new applications/products, you have to consider what data you might want to collect even if you don't need it right at the start. Especially if that is something like hardware, where you can't augment many aspects of the devices once they've been deployed.
- Focusing on data value chains is a mindset shift for most organizations, much like data-as-a-product thinking. You need to get people to stop handwaving about aspects of data work and focus specifically on value, understanding that all parts of the data creation and transformation process are crucial to driving rich and sustainable value from data.
- Even if you do a good job of understanding your data value chains, there will still need to be rework. But it can help you prioritize data rework - you aren't going to get your data preparation perfect, especially for multiple consumers, on the first try.
- You have to be realistic about your data value. Your company probably won't value internal-facing data and analytics as much as external-facing interactions until you prove out the value of treating those internal users with as much care. Part of that is getting specific about how much value you are generating and how :)
- At some point in your value chain, you aren't dealing with raw data anymore. Think about who wants what and why. Most execs want aggregated information - again, that point of driving business value instead of data work. Make sure there is clear communication to drive outcomes instead of outputs.
- A data value chain isn't about getting everything perfect upfront. Everything is about incremental delivery and getting better. What is the cost/benefit of that improvement? Get something out that works and is supportable/stable and then improve. Iteration is your friend.
- When thinking about your data value chain, it's usually best to focus again on target business outcomes/objectives. After all, that is where the value is. You can get more business people interested in data work if you are constantly talking in their language about their key objectives.
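The panel's point that handoffs are the weakest links can be made concrete in code. Below is a minimal, hypothetical sketch (Python; the stage and team names are invented for illustration) of modeling a data value chain as explicitly owned links and surfacing the cross-team handoffs where value most often leaks:

```python
from dataclasses import dataclass

@dataclass
class Link:
    """One stage ('link') in a data value chain, with an explicit owning team."""
    name: str
    owning_team: str

def handoffs(chain: list[Link]) -> list[tuple[str, str]]:
    """Return the (from_stage, to_stage) pairs where ownership changes teams.
    These cross-team handoffs are the weakest points in the chain."""
    return [
        (a.name, b.name)
        for a, b in zip(chain, chain[1:])
        if a.owning_team != b.owning_team
    ]

# A hypothetical chain: collection -> storage -> modeling -> consumption
chain = [
    Link("event collection", "checkout squad"),
    Link("raw storage", "platform team"),
    Link("cleansing + modeling", "platform team"),
    Link("fraud dashboard", "fraud analytics"),
]
print(handoffs(chain))
# [('event collection', 'raw storage'), ('cleansing + modeling', 'fraud dashboard')]
```

Making ownership explicit like this turns "who owns what" from an implicit assumption into something teams can review and agree on, link by link.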
Mar 11, 2024 • 1h 3min

#296 Patience in Product Thinking in Data - Building to Large-Scale Behavior Change - Interview w/ Darren Wood

Scott and Darren Wood explore patience and product thinking in data, discussing the challenges of managing data products, prioritizing value delivery, and upskilling in data analysis. They shift focus to minimum lovable products, collaboration in data transformation, and engaging with sponsors. Overall, it's about understanding the behavior change needed for successful data implementation.
Mar 4, 2024 • 1h 16min

#295 Data Shouldn't be a Four-Letter Word - Making Data a Forethought - Interview w/ Wendy Turner-Williams

An interview with Wendy Turner-Williams delves into integrating AI, data, ethics, privacy, and security for ethical AI practices. Discussions include aligning data work with business strategy, emphasizing communication, automation, proactive data management, and enhancing data literacy for business improvement.
Feb 26, 2024 • 1h 3min

#294 Panel: Product Discovery and Data Discoverability in a Data Mesh World - Led by Ecem Biyik w/ Frannie Helforoush, Marta Debska-Barcinska, and Ole Olesen-Bagneux

Panelists discuss product discovery and data discoverability in a Data Mesh world, highlighting the importance of partnerships, user experience, and stable data architecture. They explore challenges in data product discovery, emphasize continuous engagement, and share insights on creating valuable data products through innovative approaches.
Feb 19, 2024 • 1h 6min

#293 Adapting Product Management to Data - Finding the Customer Pain and the Value - Interview w/ Amritha Arun Babu Mysore

Amritha's LinkedIn: https://www.linkedin.com/in/amritha-arun-babu-a2273729/

In this episode, Scott interviewed Amritha Arun Babu Mysore, Manager of Technical Product Management in ML at Amazon. To be clear, she was representing only her own views on the episode.

In this episode, we use the phrase 'data product management' to mean 'product management around data' rather than specifically product management for data products. It can apply to data products but also to something like an ML model or pipeline, which will be called 'data elements' in this write-up.

Some key takeaways/thoughts from Amritha's point of view:

- "As a product manager, it's just part of the job that you have to work backwards from a customer pain point." If you aren't building to a customer pain, if you don't have a customer, is it even a product? Always focus on who you are building a product for, why, and what the impact is.
- Data product management is different from software product management in a few key ways. In software, you are focused "on solving a particular user problem." In data, you have the same goal, but there are often more complications, like not owning the source of your data and potentially more related problems to solve across multiple users.
- In data product management, start from the user journey and the user problem, then work back to not only what a solution looks like but also what data you need. What are the sources, and do they exist yet?
- Product management is about delivering business value. Data product management is no different. Always come back to the business value from addressing the user problem.
- Even your data cleaning methodology can impact your data. Make sure consumers that care - usually data scientists - are aware of the decisions you've made. Bring them in as early as possible to help you make decisions that work for all.
- ?Controversial?: Try not to over-customize your solutions, but oftentimes you will still need to really consider the very specific needs of your consumers. Build for reuse but also build where your consumers are actually having their needs met. A mediocre solution for all is usually worse than a few specialized solutions.
- Prioritization is crucial in product management. That applies to features within the products but also the products themselves. There are many potential use cases that won't be met because there isn't enough value. That's the name of the game, return on investment; it's not about capturing all value possible.
- Communication and building relationships/trust are foundational in product management. It's an art as much as a science. If you can't have tough conversations and get alignment, it is FAR harder to build a product that meets customers' needs.
- Relatedly, establish regular communication with your customers. You shouldn't only be talking to them when things go wrong. Stay on top of what is driving value for them and look to augment your product proactively, not only reactively.
- Product management requires patience as much as diligence. Sometimes your data product/element violates its SLAs but it's an outlier, a one-off. Don't overreact and jump to changing things. But you obviously need to have serious conversations if elements aren't meeting expectations over a more extended time period.
- If you aren't sure what products you should create in a new area, talk to people and find the points of friction. What are the pain points, and is there enough value in addressing them to justify doing the work?
- It's crucial to deeply converse with potential users of a data product/element to assess if it's really going to be worth the effort. There is always a chance you build something that isn't used/valuable, but through deep investigation and ideation with potential customers, you can avoid that far more often.
- When you are building something, even before it hits 'GA', get validation. You can save yourself a ton of effort in rework as you find a better solution sooner.
- Product management is about collaborating to drive towards value. You are there to prioritize and coordinate. You don't have to know everything, but your job is to uncover as much understanding as possible to maximize your value creation and minimize wasted work.
- Always ask what value building something for your customer will drive. But also ask what happens if we don't build it. What is the cost of not acting?
- The only constant is change, especially in data. Leverage a "loosely dependent architecture" to be able to adapt to change. And be open and honest with customers that things will change. Emphasize you'll work with them to adapt to those changes.

Amritha started the conversation on some key differences between software product management and product management around data - whether specific to 'data products' or not. One similarity is the focus on solving a particular user problem, but in data, you might have to build something to address multiple users' problems. A much bigger difference is that in data, you often don't own the entire process, as you might be reliant on others to source your data. In software, you are generally building the data sourcing because you own the interaction creating the data. How the data is stored and collected throughout the upstream process impacts what you can do.

The user problem, the business value, and the user journey are some key guides to doing data product management well for Amritha. Keep coming back to those as you build out your solution. Focus on understanding what the user really needs and work backwards to the sources. And then of course focus on making sure you are actually addressing user needs when you deploy the solution. There are many reasons a data element may not be performing up to expectations, so be prepared to deep dive: is there a problem with what you've built, with what's feeding your data element - maybe sources have changed or there is a quality issue - or is it just not performing to expectations because the hypothesis was wrong?

Amritha dug a bit more into some challenges specific to product management in machine learning and AI. While data scientists want clean data, when possible they want to even be part of the process of selecting the cleaning methodologies - even that can impact the data enough to change outcomes. So really start from the process of bringing them in as a stakeholder as soon as you can and don't throw data over the wall at them. And if you already have something developed, share your methodologies and help them figure out if it's the right fit for them or if something new needs to be developed. Again, we want reuse but we also want solutions that address their problems. Always a hard set of needles to thread.

"As a product manager, it's just part of the job that you have to work backwards from a customer pain point." Amritha questions if you are even building a product if you don't have a customer. What is the business value of the work? For a product manager of a product without a customer, are you focused on your own thoughts and biases rather than the needs of consumers? "So the point here is that at any given point, you have to be cognizant of who are you building this for, why, and what that is the primary customer. And the secondary is: who else if I build this, what are the impacts it will have on my secondary customers, or other downstream or interacting applications?"

Amritha talked about one crucial rule in product management: prioritize. There are many use cases you _could_ solve, but are they actually worth the effort? Think about what will impact your organization the most. Don't try to solve every use case and don't try to make products that can serve every potential customer - focus on delivering value. Scott note: this can be a slippery slope in data mesh. You want to take on use cases you actually can tackle when you are learning. Don't only go for the biggest value; tackle problems where the juice is worth the squeeze, where the outcome is worth the effort.

In product management, Amritha believes it's absolutely crucial to understand the art and the science. The science is more about whether the product is specifically meeting the needs it was designed for. Basically, measuring the level of success and determining if that's good enough - or especially if it's _still_ good enough. But even that last bit can be a bit of art. The real art is all about communication and building relationships. If you build the world's objectively best product but no one trusts it or understands it enough to use it, it's not a valuable product. You must build strong relationships and have tough conversations with stakeholders, earning their trust, to align on what needs to get built and why, as well as on when a product isn't meeting expectations. Establish regular lines of communication so that it's not only bad news or big changes when you talk to your customers. Continue to extract information from them to drive to business value.

When it comes back to the science, that's when Amritha believes you should dig into why something isn't meeting expectations from the technical perspective :) And have some patience around that. Sometimes it's a blip on the radar, not anything more.

When figuring out what products/data elements you might want to build in a specific area, Amritha recommends digging into the potential workflows and user journeys. Start to really think about what you think could exist and why. But, instead of trying to ideate only by yourself, go and talk to people and listen for their pain and points of friction. They may not even realize they have pain, but you can find the challenges that people will want to address. Again, work backwards from the user journeys to discover what products you should build 😅

Amritha talked about how to maximize the chance that what you're building will be used/valuable. A lot of it is simply digging in deep with potential customers in the ideation phase to make sure this will actually drive value. There are ways to do that, but a lot of it is simply spending the time to really understand the likely impact of what you're building. As Alla Hale said in episode #122, "What would having this unlock for you?" Also ask, "what if we don't do this, what is the impact of not doing this?" And make sure to get validation as you're building. It might be that the value hypothesis was wrong or that you're building something that is the wrong or a suboptimal way to address the challenge/opportunity. You can save yourself a lot of headaches and rework. It's all about that collaboration to drive to value.

In wrapping up, Amritha talked about how changes, especially in data, are inevitable. Make sure to communicate with consumers so they have realistic expectations. Sometimes those are proactive changes but often, you don't have that much control over changes, especially coming from upstream in data. Look to build in a way that can adapt and leverage a "loosely dependent architecture".
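Amritha's patience point - don't overreact to a one-off SLA miss, but escalate when misses are sustained - can be sketched as a simple rolling-window check. This is a hypothetical illustration; the window size and breach threshold below are invented assumptions, not recommended values:

```python
def should_escalate(breach_history: list[bool], window: int = 30,
                    max_breaches: int = 3) -> bool:
    """Escalate only when SLA breaches are sustained, not a one-off outlier.

    breach_history: one bool per run/day, True meaning the SLA was violated.
    Looks at only the most recent `window` entries and escalates once the
    breach count reaches `max_breaches`. Thresholds are illustrative only.
    """
    recent = breach_history[-window:]
    return sum(recent) >= max_breaches

# A single blip in 30 days: be patient, no escalation
print(should_escalate([False] * 29 + [True]))  # False

# Three misses in the recent window: time for a serious conversation
print(should_escalate([False] * 20 + [True, False, True, False, True]))  # True
```

The design choice here mirrors the write-up: the check deliberately ignores any single violation and only reacts to a pattern over an extended period.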
Feb 12, 2024 • 1h 6min

#292 Aligning Your Data Transformation to the Business - Interview w/ Nailya Sabirzyanova

Scott interviews Nailya Sabirzyanova, Digitalization Manager at DHL, on aligning data transformation to the business. Key takeaways include the importance of aligning application, business, and data architectures, involving business counterparts, transitioning to data mesh, leadership support, coordinating data across domains, and the need for conversations and alignment with business objectives.
Feb 5, 2024 • 1h 2min

#291 Panel: Data as a Product in Practice - Led by Jen Tedrow w/ Martina Ivaničová and Xavier Gumara Rigol

Guests Martina Ivaničová and Xavier Gumara Rigol, both data-as-a-product practitioners, discuss the importance of treating data as a product and creating a valuable experience for customers. They explore challenges in spreading information, discrepancies in data strategy perspectives, and implementing data as a product. The discussion concludes by emphasizing the competitive advantage of investing in data products.
Jan 29, 2024 • 1h 6min

#290 Applying Platform Engineering Best Practices to Your Mesh Data Platform - Interview w/ Tom De Wolf

Discover how platform engineering best practices can enhance your Data Mesh platform with insightful tips from Tom De Wolf. Learn about balancing freedom and control, feedback cycles, and the importance of simplifying complexities to create valuable data products.
Jan 22, 2024 • 51min

#289 Building the Right Foundations for Generative AI - Interview w/ May Xu

Please Rate and Review us on your podcast app of choice!

Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

Episode list and links to all available episode transcripts here.

Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn.

Transcript for this episode (link) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info-gated) here.

May's LinkedIn: https://www.linkedin.com/in/may-xu-sydney/

In this episode, Scott interviewed May Xu, Head of Technology, APAC Digital Engineering at Thoughtworks. To be clear, she was only representing her own views on the episode.

We will use the terms GenAI and LLMs to mean Generative AI and Large Language Models in this write-up rather than spell out the entire phrase each time :)

Some key takeaways/thoughts from May's point of view:

- Garbage in, garbage out: if you don't have good quality data - across many dimensions - and "solid data architecture", you won't get good results from trying to leverage LLMs on your data. Or really from most of your data initiatives 😅
- There are three approaches to LLMs: train your own from scratch, start from pre-trained models and tune them, or use existing pre-trained models as is. Many organizations should focus on the second.
- Relatedly, per a survey, most organizations understand they aren't capable of training their own LLMs from scratch at this point.
- It will likely take any organization at least around three months to train their own LLM from scratch. Parallel training and throwing money at the problem can only take you so far. And you need a LOT of high-quality data to train an LLM from scratch.
- There's a trend towards more people exploring and leveraging models that aren't so 'large', that have fewer parameters. They can often perform specific tasks better than general large-parameter models.
- Similarly, there is a trend towards organizations exploring more domain-specific models instead of general-purpose models like ChatGPT.
- ?Controversial?: Machines have given humanity scalability through predictability and reliability. But GenAI inherently lacks predictability. You have to treat GenAI like working with a person, and that means less inherent trust in its responses.
- Generative AI is definitely not the right approach to all problems. As always, you have to understand your tradeoffs. If you don't feed your GenAI the right information, it will give you bad answers. It only knows what it has been told.
- Always start from the problem you are trying to solve rather than the approach you are trying to use. Then evaluate if GenAI is the right approach for that problem. Simple, fundamental stuff but it's crucial to remember: start with the problem before the proposed solution.
- Many people are leaping to use GenAI because their past approaches to certain problems haven't worked. Dig into those pains. GenAI may or may not be the right approach but either way it can be great for surfacing persistent challenges.
- Leverage people's enthusiasm for GenAI to have deeper conversations about general business challenges. It can really start to highlight friction points across organizational boundaries and who is responsible for what. Scott note: But as the data team, be careful not to try to fix the entire organization, that's not what you are responsible for 😅
- Right now, despite all the hype, most organizations are still at most in small-scale PoCs around GenAI. The initial focus is less on return on investment and more on what capabilities GenAI might unlock - and what risks it may introduce. Despite the hype, many to most organizations are doing their diligence.

May started with three general approaches organizations are taking to generative AI (GenAI): 1) building their own LLMs from scratch, 2) fine-tuning specific, pre-trained existing LLMs, or 3) leveraging pre-trained LLMs as is. Many organizations may want to do the first, but it is prohibitively expensive to train your own LLMs from scratch for the compute alone, and you also need (very expensive) people with very specific expertise to do so. Tuning pre-trained models will likely become the standard approach for many organizations. However, leveraging LLMs on internal data in general requires "existing good quality data and solid data architecture."

When considering training a model from scratch, May also pointed to time as an issue. Typically, it takes at least three months to properly train an LLM from scratch. Parallel training is helpful, but you need to fine-tune results and retrain, so you can't just throw compute at it and make the process that much faster. So again, you need high-quality data - and you need a LOT of it - plus a fair amount of time plus a ton of money. Once you are in production, it also takes a lot of money and effort to keep models running and tuned properly 😅 Luckily, according to some surveys Thoughtworks did, most organizations recognize training LLMs from scratch isn't the right call for them just yet.

May is seeing a trend of people moving away from the 'bigger is better' mentality. More people are starting to explore more targeted and specialized models that have fewer parameters. And often, for specific tasks, they perform better than the first L in LLMs. So we may see a trend towards more and more targeted LLMs/models. Scott note: Madhav Srinath really leaned into this in his episode, #264.

Humanity in general has benefited greatly from machines through predictability and reliability, according to May. Essentially, if they are made well, you know what you should/will get from machines. But GenAI is designed specifically to act like humans, and humans are not predictable and often not that reliable. So people have to get used to interacting with machines that may give wrong answers and are designed - in a way - to do so 😅 We can't expect predictability and reliability from GenAI.

Relatedly, when thinking about where GenAI is the right choice versus traditional machine learning/AI, May believes you really have to dig into the tradeoffs. If you already deeply understand the problem set and what you are trying to accomplish, traditional ML/AI is probably the better approach for you. You need to really understand where the strengths of GenAI will play and feed it the data/information it needs to succeed; otherwise you'll be asking an uninformed and unpredictable entity to solve your most pressing business problems. That's probably not going to go well…

May talked about going back to the basics of problem solving when it comes to generative AI: ask what problem you are trying to solve, rather than picking an approach first and then finding your way back to a problem for it. It can sound obvious, but many are in such a rush to leverage these tools that it's crucial to stop and consider. Start with the problem before the solution 😅

GenAI may also surface a number of internal business challenges that weren't spoken about or that people had essentially given up on tackling, according to May. We have a new tool in the toolbox, so people want to see if it will be useful for something they haven't been able to address well previously. Lean into GenAI as a conversational lubricant. GenAI may not be the right tool for every one of these challenges, but it means there is more internal conversation and sharing :)

From what May is seeing, many to most organizations are still in the early experimenting and PoC phase with generative AI. They are trying to figure out what opportunities GenAI brings and also what risks. Despite the hype, people are taking their time, though they are less focused on initial return on investment and more on validating whether they can actually leverage GenAI to create value. Also, there is a strong trend towards domain-specific LLMs rather than general-purpose ones, e.g. financial-sector or media-specific models.

May finished on the idea that data mesh and other data management paradigms are crucial to doing something like GenAI right. There is still a strong need for quality data that is accessible, interoperable, privacy-aware, secured, etc. to be able to leverage GenAI well.

Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
