#51 - Data Observability w/ Lior Gavish (Monte Carlo)
Oct 11, 2021
Lior Gavish, CTO and co-founder of Monte Carlo, talks about data observability and its importance in ensuring reliability and trust in data products. The conversation covers success stories of selling the idea of data observability, the importance of data quality and governance in versioned data, and how business users can benefit from Monte Carlo's data observability platform. It also touches on the future of data, including the challenges of data discovery and the evolution of data management practices.
Data observability is crucial for treating data as a reliable and trustworthy product used for analytics, decision-making, machine learning models, or data stores that are part of a digital experience.
Data lineage is critical for data observability, allowing teams to monitor the reliability of data products, identify issues, understand dependencies, trace data quality problems back to their source, and prevent downtime.
Deep dives
Defining Data Observability
Data observability is the practice of measuring the reliability and health of data products. It starts from treating data as a product, whether that product is used for analytics, decision-making, machine learning models, or data stores that are part of a digital experience. Productizing data requires ensuring its reliability and trustworthiness, and data observability plays the role that observability plays in software engineering: it provides visibility into how the system works, allowing teams to proactively manage, resolve, and prevent issues. Companies are investing in data observability to maximize the value of their data and deliver reliable and trustworthy data products.
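As a rough illustration of what "measuring the reliability and health" of a data product could look like, here is a minimal Python sketch. The two checks (freshness and row volume), the `TableStats` class, the `check_health` function, and all thresholds are assumptions chosen for illustration; they are not taken from the episode and do not represent Monte Carlo's product or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TableStats:
    """Hypothetical snapshot of a table's metadata (illustrative only)."""
    name: str
    last_updated: datetime
    row_count: int


def check_health(stats, now, max_staleness, expected_rows, tolerance=0.5):
    """Return a list of human-readable issues; an empty list means healthy."""
    issues = []
    # Freshness: has the table been updated recently enough?
    if now - stats.last_updated > max_staleness:
        issues.append(f"{stats.name}: stale")
    # Volume: is the row count within tolerance of the expected count?
    if abs(stats.row_count - expected_rows) > tolerance * expected_rows:
        issues.append(f"{stats.name}: unexpected row count {stats.row_count}")
    return issues


# Example: a table last updated 30 hours ago with far fewer rows than expected.
now = datetime(2021, 10, 11, 12, 0)
orders = TableStats("orders", last_updated=now - timedelta(hours=30), row_count=400)
issues = check_health(orders, now, max_staleness=timedelta(hours=24), expected_rows=1000)
print(issues)
```

In practice the expected values would likely be learned from a table's history instead of being hard-coded, but fixed thresholds keep the sketch short.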
The Importance of Data Lineage
Data lineage, which involves understanding the origin, flow, and transformations of data, is critical for data observability. It helps teams monitor the reliability of data products and quickly identify and resolve issues. Data lineage enables teams to understand dependencies, trace data quality issues back to their source, and prevent downtime. Monte Carlo offers a hybrid solution, allowing customers to install elements of their system within their own infrastructure to minimize external exposure and maintain control over their data while delivering a software-as-a-service experience.
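To make the idea of tracing a data quality issue back to its source concrete, here is a minimal Python sketch that walks a lineage graph upstream. The table names, the `UPSTREAM` mapping, and the `trace_upstream` function are hypothetical and exist only for illustration; how lineage is actually collected and stored was not described here.

```python
from collections import deque

# Hypothetical lineage: each table maps to the tables it is built from.
UPSTREAM = {
    "revenue_dashboard": ["daily_revenue"],
    "daily_revenue": ["orders", "payments"],
    "orders": ["raw_orders"],
    "payments": ["raw_payments"],
}


def trace_upstream(table, upstream):
    """Return every ancestor of `table`, breadth-first, nearest first."""
    seen, order = set(), []
    queue = deque(upstream.get(table, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        order.append(current)
        queue.extend(upstream.get(current, []))
    return order


# If the dashboard looks wrong, these are the tables to inspect, nearest first.
ancestors = trace_upstream("revenue_dashboard", UPSTREAM)
print(ancestors)
```

Reversing the mapping (table to downstream consumers) would answer the complementary question of which data products are affected when a given source breaks.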
Challenges and Opportunities in the Data Industry
The data industry is rapidly evolving, with increasing focus on productizing data and building data products. This trend presents challenges such as managing data complexity, ensuring reliability, and enabling self-serve access to data products. There is room for innovation in areas like data discovery, which helps users find and understand data products, and in addressing security and privacy concerns. The industry is moving towards a cloud-based, distributed data stack that enables velocity, self-serve autonomy, and greater control and observability.
The Future of Data Observability
The future of data observability will likely involve continued innovation in tools and methodologies for managing and ensuring the reliability of data products. The industry will see consolidation and simplified tooling to address the complexities of building and managing data products. Companies will prioritize data quality, lineage, and governance, and provide more visibility and control to business users. The focus will be on maximizing the value of data, delivering highly reliable data products, and building a culture of operational discipline and understanding the impact of data on business outcomes.
Lior Gavish (CTO, co-founder @ Monte Carlo) joins the Monday Morning Data Chat to discuss data observability, and how you can start doing it in your stack today.
Streamed live on LinkedIn and YouTube
#data #dataengineering #machinelearning
---------------------------------
TERNARY DATA
We are Matt and Joe, and we’re "recovering data scientists". Together, we run a data architecture company called Ternary Data. Ternary Data is not your typical data consultancy. Get no-nonsense, no BS data engineering strategy, coaching, and advice. Trusted by great companies, both huge and small.