AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Understanding the Context and Purpose of Data: Shifting Mindsets
This chapter emphasizes the need for a people-focused approach to data and invites listeners to engage in conversations about data contracts and related topics.
Please Rate and Review us on your podcast app of choice!
Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn.
Transcript for this episode (link) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info-gated) here.
Ryan's LinkedIn: https://www.linkedin.com/in/ryancollingwood/
In this episode, Scott interviewed Ryan Collingwood, Head of Data and Analytics at OrotonGroup. To be clear, he was only representing his own views on the episode.
Some key takeaways/thoughts from Ryan's point of view:
Ryan started off with some framing of how he looks at tech approaches in general, but especially how he started looking at data contracts. Most paradigms are presented as if every organization is very tech-y, like a tech startup. With data contracts, in much of the content "…there was this assumption that you had multiple teams of people that had a fairly high degree of technical sophistication, … or maybe even data was their primary focus." So when a less tech-y company wants to leverage the paradigm, there are always some adjustments necessary 😅 and for those types of companies, it's so much more about the people than the way most paradigms are presented suggests. That makes some sense because every org's ways of working and culture are different, but it can still feel very removed from reality for less tech-heavy companies.
When focusing specifically on data contracts, Ryan's company is far more batch than streaming. So even when trying to leverage the best advice (Scott note: I highly recommend Andrew Jones for that), he had to adjust some aspects to a world where things were a bit messier and teams weren't as data mature. When approaching how to tweak data contracts so they would still work, he asked the rhetorical but crucial question: "What are the trade-offs that I can make, while still being true to the value and the benefits that I want to get out of this?"
Ryan moved into what he sees as the minimum viable value aspects of data contracts. You need two parties, you need an agreement of some kind that is recorded, and you need access to data that conforms to the agreement*. As to the parts of the agreement, Ryan focused on two factors at the start: semantics and data quality. If people can't understand the data, can they use it? If they don't understand the quality, can they really trust it enough to rely on it? So they worked to create a data dictionary and also to give people a better understanding of the different angles on data quality.
* Scott note: this could somewhat disagree with the idea many have around data contracts of merely publishing data with SLAs because while there is a consuming party, they aren't really part of the agreement, they only choose to use the data based on the existing SLAs/contract around it. There's lots of nuance but I HIGHLY believe in the communication-heavy aspect Ryan and Andrew Jones both present.
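The three minimum parts Ryan lists (two parties, a recorded agreement, and data that conforms to it) can be sketched in a few lines of Python. This is only an illustration, not how Ryan's team implemented it: the class names `FieldSpec` and `DataContract`, the team names, and the columns are all hypothetical. The `meaning` attribute stands in for the data dictionary (semantics) and `required` for one very simple data quality rule.

```python
from dataclasses import dataclass, field


@dataclass
class FieldSpec:
    """One data dictionary entry: what a column means plus a quality rule."""
    name: str
    meaning: str             # semantics, written in plain language
    required: bool = True    # data quality: may the value be missing?


@dataclass
class DataContract:
    """A recorded agreement between two parties about one dataset."""
    producer: str
    consumer: str
    fields: list[FieldSpec] = field(default_factory=list)

    def violations(self, rows: list[dict]) -> list[str]:
        """Return human-readable problems found in a batch of rows."""
        problems = []
        for i, row in enumerate(rows):
            for spec in self.fields:
                if spec.required and row.get(spec.name) in (None, ""):
                    problems.append(f"row {i}: '{spec.name}' is missing")
        return problems


# Hypothetical example: the parties and columns are invented for illustration.
contract = DataContract(
    producer="store-operations",
    consumer="marketing-analytics",
    fields=[
        FieldSpec("order_id", "Unique identifier of a customer order"),
        FieldSpec("store_code", "Code of the store where the order was placed"),
        FieldSpec("promo_code", "Promotion applied, if any", required=False),
    ],
)

batch = [
    {"order_id": "A1", "store_code": "SYD01", "promo_code": None},
    {"order_id": "A2", "store_code": ""},
]
print(contract.violations(batch))
```

A batch check like this fits a batch-oriented setup such as the one Ryan describes, but the agreement itself still comes out of the conversation between the two parties, not out of the code.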
Often, comparing what was presented for a tech-heavy company with what is possible at a more regular organization can be disheartening, according to Ryan. The idea that the end picture at your organization should look like the one presented is pervasive. So it's not only hard to adapt the approach, but then you wonder if you even captured the value 😅 Can you even call it 'data contracts' or whatever you are working on?! Imposter syndrome is very common here. Scott note: you could definitely call what Ryan and team are doing data contracts :)
Ryan also talked about how in data contracts, you must build for change. Change is the only constant, after all, so creating systems that don't handle change well is a great way to manufacture more headaches down the road. Much like in software testing, you can more easily tell when something no longer works and needs to be changed. And when the data team is the actual data producer, they can be much more sure that what they are doing is correct. That is often the case if the data team are the ones transforming the data, or, with a centralized data team, if they are the only group of people consumers talk to.
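The software testing comparison can be made concrete with a small check that compares what was agreed with what actually arrived. This is a minimal sketch under stated assumptions: the function name `schema_changes`, the column names, and the use of plain type names as strings are all hypothetical, not something described in the episode.

```python
def schema_changes(agreed: dict[str, str], observed: dict[str, str]) -> list[str]:
    """Compare the agreed column types with the columns that actually arrived."""
    changes = []
    for column, agreed_type in agreed.items():
        if column not in observed:
            changes.append(f"missing column: {column}")
        elif observed[column] != agreed_type:
            changes.append(f"{column}: expected {agreed_type}, got {observed[column]}")
    for column in observed:
        if column not in agreed:
            changes.append(f"new column: {column}")
    return changes


# Hypothetical schemas: column name -> type name.
agreed = {"order_id": "str", "amount": "float"}
observed = {"order_id": "str", "amount": "str", "channel": "str"}

for change in schema_changes(agreed, observed):
    print(change)
```

An empty result means nothing has drifted. A non-empty one is a prompt to go and talk to the other party, much like a failing test is a prompt to look at the code.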
Another key learning Ryan had along the journey was that when displaying data quality, you should make the metrics easier for the layperson to understand. Historically, data quality has been measured with complex statistics. Most people can't easily read the resulting charts to understand what's going on. Make the data quality metrics understandable so people can see progress but also get a sense of how well they can rely on the data. It is a sad truth that you can deliver value, but if you can't get others to see that value, it isn't valued. Showing that value gets people to lean in.
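One way to read "understandable to the layperson" is to report a quality measure as a plain sentence with counts instead of a statistical chart. The sketch below does this for completeness only. The function name `completeness_summary` and the sample rows are hypothetical, and completeness is just one of the angles on data quality, chosen here because it is easy to explain.

```python
def completeness_summary(rows: list[dict], column: str) -> str:
    """Describe how often a column is filled in, as one plain sentence."""
    total = len(rows)
    if total == 0:
        return f"No rows to check for '{column}'."
    # A value counts as filled if it is present and not an empty string.
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    percent = round(100 * filled / total)
    return f"{filled} of {total} rows ({percent}%) have a value for '{column}'."


# Hypothetical sample data.
rows = [
    {"email": "a@example.com"},
    {"email": ""},
    {"email": "b@example.com"},
    {},
]
print(completeness_summary(rows, "email"))
```

Tracking the same sentence week over week is a simple way to let people see progress without asking them to interpret a distribution.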
Ryan dug a bit deeper into creating systems that act with empathy. If you approach data contracts as consumers only get what the producer shares, that doesn't end up serving the end needs that well. But if you are treating the contracts as the culmination of multiple conversations, the producer can start to really understand the impact of bad data. How much work do data consumers have to do to actually use the data? This is where empathy and product thinking come in.
"…data, as we know, it is merely a side effect of activity, of stuff happening." Ryan believes we need to move past the 1s and 0s thinking in data and focus on what it reflects and how that impacts the people in the organization. Conversations can be hard but they give you the context necessary to maximize the impact of your deep systems work. Talking with people can help both parties bridge the gap between understanding what is happening in the real world versus the data 😅
Internally in Ryan's org, they wanted to review their general processes. Part of that was the uncomfortable truth that change, especially to processes, impacts the data. So that review created a great opportunity to start to implement data contracts. It wasn't about telling people they were doing data contracts, it was about getting people bought in to what value could be delivered if they did data quality and trust better. It just happened to be via data contracts.
When actually starting out, Ryan looked for one ally who saw the potential benefits and was willing to take on some of the complexity of dealing with data contracts. Instead of trying to convert the whole organization, the effort was contained, which let Ryan learn how to implement data contracts well in his specific organization. That initial success gave him the confidence to move further and the success story to entice additional partners/allies.
Ryan discussed the push and pull of data quality and value. While it might be valuable to have a long history of data, is the cleanup worth it? Really have conversations and make hard choices that align to return on investment instead of merely asking whether consumers want it. Similarly, people need to confront the idea of data being right or wrong. They need to consider the cost of some data being wrong, especially slightly off. If the data is for a regulator, that cost is potentially high. But if it's your weekly marketing leads report and it's off by 0.2%, how big of a deal is that? And how much trust is lost if it's wrong? Can we get people to understand data is never 100% clean/right? Getting people to act on signals will likely be somewhat challenging, but it's a better way to navigate than trying to wait for exact measurement in many - most? - cases.
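The "off by 0.2%" question can be written down as an explicit tolerance that differs by use case. This is a rough sketch, not a recommendation on actual thresholds: the function name `within_tolerance`, the lead counts, and the 0.5% and 0% tolerances are assumptions made up for illustration.

```python
def within_tolerance(reported: float, reference: float, tolerance_pct: float) -> bool:
    """True if reported differs from reference by at most tolerance_pct percent."""
    if reference == 0:
        return reported == 0
    error_pct = abs(reported - reference) / abs(reference) * 100
    return error_pct <= tolerance_pct


# Hypothetical weekly leads report: 998 reported against 1000 in the source
# system, so the report is 0.2% off.
print(within_tolerance(998, 1000, tolerance_pct=0.5))  # relaxed, marketing-style tolerance
print(within_tolerance(998, 1000, tolerance_pct=0.0))  # strict, regulator-style tolerance
```

Agreeing on the tolerance per use case is the hard conversation. Once it is agreed, it can be recorded alongside the rest of the contract.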
Ryan wrapped up back on dealing with yourself and others with empathy. You might not get it right at first but if there's trust, you can iterate towards better together. That goes for your data, your processes, and your relationships.
Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used in this episode was found on Pixabay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf