When the podcast guest joined BigPanda, the company had operational databases and some analytics capabilities, but many questions couldn't be answered for lack of data. Recognizing the need for a larger and more maintainable data platform, the guest and a task-force team set out to build a centralized one. Their first use case was supporting a new business model by leveraging the data already in their operational databases for analytics. They started with a basic infrastructure (Snowflake, Upsolver, and dbt) and focused on democratizing the data while staying mindful of costs. As the team grew and ownership boundaries were defined, communication became crucial, both within the team and across global locations. Technical working groups, regular presentations, and direct communication channels were established to foster collaboration and ensure everyone had a voice in the data platform journey.
One key indicator for investing in data is when product managers start asking questions about specific data points and there is no easy access to that data. This indicates a need for a data platform to provide answers and insights. Another indicator is when R&D teams are unaware of the scale of data being processed in their systems, signaling a lack of visibility into data flow and the need for better measurement and analysis. Additionally, as a B2B company grows, there comes a point where important questions about the product and its performance cannot be answered without robust data analysis. Deciding the right time to invest in data infrastructure and resources is a challenge, but recognizing these indicators can help guide the decision-making process.
Effective communication and collaboration are crucial to the success of the data platform. The podcast guest implemented technical working groups, including representatives from various teams, to facilitate discussions and information exchange. Group presentations and regular updates helped disseminate information and gather feedback. The guest emphasized the importance of clear ownership and accountability for data domains, not only within the data team but also across the organization. Alongside internal communication, the guest highlighted the value of engaging external customers and understanding their needs. Over time, the team learned to adjust the composition of communication groups and involve engineers to ensure a holistic and informed approach to data management and decision-making.
In this podcast episode, the speaker emphasizes the significance of establishing a robust data infrastructure within a company. They describe the process their team followed, which involved creating a streamlined process for data creation and review. This infrastructure not only allowed for easier data democratization but also ensured that data ownership rested with the respective teams. The speaker acknowledges that not all teams fully owned their objects initially due to readiness factors and challenges in defining the responsible team for certain data objects. Overall, the creation of a solid data infrastructure was seen as a crucial first step in enabling effective data utilization and value creation.
Another key point discussed in this podcast episode is the importance of being mindful and intentional in decision-making when it comes to data management. The speaker emphasizes the need to focus on the value and purpose of the data being collected and not simply accumulating data without a clear strategy. They highlight the role of product management in guiding this decision-making process and articulating the reasons behind data initiatives. Additionally, the speaker mentions the challenges of cost management and the need to balance real-time data requirements with the actual value it brings. Overall, the key takeaway is to approach data management with a clear understanding of the goals and value it can provide to both internal and external stakeholders.
Please Rate and Review us on your podcast app of choice!
Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Corrin's LinkedIn: https://www.linkedin.com/in/corrin/
In this episode, Scott interviewed Corrin Shlomo Goldenberg, Senior Product Manager of the Data Platform at BigPanda.
It's important to note that BigPanda is not yet at the stage where data mesh makes sense, but this is a story of getting data production into the hearts and minds of the application development team, which is crucial to doing data mesh well, whether that happens before a data mesh journey or as part of it.
Some key takeaways/thoughts from Corrin's point of view:
Corrin started with the tale of BigPanda and how she began building out their data, ML, and analytics capabilities. When she came in, they didn't have the infrastructure, or really the focus, for a scalable platform to store and analyze their internal data. They were doing a lot of this for external clients but hadn't moved to doing it internally, which is pretty common in B2B startups. But BigPanda wanted a data-driven transformation of their business model, so they had to change the situation around their internal data.
There is always a balance for when you start collecting data at scale, in Corrin's mind. At a B2B startup, you need to ask how early the company should invest, and the same applies to an early-stage offering at a larger organization. Most development teams aren't tasked with creating the necessary data until far later in an offering's lifecycle, but it would be nice if you could include it from the start. It definitely isn't free, though, so there is always a balance, and the conversations need to happen, ideally sooner rather than later.
Corrin's tipping point, when you should really start to press development teams on creating the necessary data, is when it becomes hard to answer simple 'how many'-type questions. It's also an easier conversation to have than a hypothetical one. If it takes more than a day to get basic information on how your customers are using your product, that's obviously an issue that will only grow. It's also a pretty tangible place to start.
When they started to build out the data platform, Corrin said it just made sense to start centralized. If the R&D team wasn't really thinking about data, trying to upskill them enough to take over the work entirely was probably a bridge too far. Plus, if your data requirements aren't complex enough to require decentralization, decentralization is often just an extra layer of complexity. So they moved to a high communication model where people can see what data work is happening even if it's controlled by the central team. They can slowly upskill the development teams to understand data instead of trying to hand over ownership prematurely.
Corrin talked about working with the team to bring a product mindset to data. Start from the why: it's easy to fall into the trap of trying to do everything because it might have value. That's what happened with data lakes that became data swamps. Focus people on the why and you can bring them more and more into working with data.
Similarly, while Corrin and team didn't have a lot of pushback on getting things done, she was very cognizant of prioritization and cost/benefit. Again, focusing on 'the why': what is most important and when? Why are the requirements like this? Can we cut the cost down by storing for less time and/or refreshing less often? When you say 'real time', what do you actually mean? Etc.
Corrin has been seeing good results from having strong ownership conversations. While the central team still owns the data, they are partnering with the domains as the domains still need to own the concepts and the understanding of the information. While this might not work at a large scale, it's perfectly normal and functional at a 300 person company. Scott note: centralization isn't the enemy until it becomes a bottleneck 😎
As with all global companies, BigPanda has some challenges around communication, per Corrin. Time zone differences and, of course, differences in focus are just two of them. So she recommends spending a lot of time communicating with stakeholders about what you are building and why. It's easy to assume that because you build out a data product, people will use it, but you have to work with people to ensure they actually use what you built.
Corrin pointed to the fact that many companies in the B2B space feel they aren't "data oriented" enough. She gave a few tips for becoming more data oriented but also has empathy for that feeling; it's pretty common, as most B2B companies feel they aren't as data oriented as everyone else. It's similar to data mesh, where everyone believes all the other companies are far down their path. It's simply optics: companies project a better image than the reality of their situation with data.
Learn more about Data Mesh Understanding: https://datameshunderstanding.com/about
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used in this episode was found on Pixabay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf