Post Deployment Data Science with Wojtek Kuberski #56
Feb 13, 2025
Wojtek Kuberski, Co-Founder and CTO at NannyML, shares his expertise in AI and data science. He discusses the challenges of model monitoring post-deployment, including covariate shift and concept drift. Wojtek explains how NannyML's algorithms assess model performance without needing labels. He emphasizes the critical role of continuous monitoring to prevent silent failures that could impact businesses. The conversation also touches on his transition from freelancing to building a product-focused company in the ever-evolving data science landscape.
Freelancers can enhance their skills and adapt quickly by tackling challenging projects and seeking mentorship from experienced professionals.
NannyML addresses the critical need for effective post-deployment model monitoring to counteract issues like data drift and concept drift.
Deep dives
Learning and Growth in Freelancing
Freelancers often lack senior guidance, which makes learning challenging. One effective way to grow is to seek mentorship from experienced professionals, while another approach is to tackle projects that are beyond one's current capabilities. Engaging with difficult tasks forces freelancers to adapt quickly by diving into research and real-world problem-solving, often leading them to develop valuable skills and insights on the job. For instance, individuals may need to self-educate on complex technical topics, such as machine learning, by studying literature and learning new coding practices while handling client projects.
Transitioning from Freelancing to Product Development
The decision to shift from freelance work to launching a product company can stem from various motivations, including market demand and personal circumstances. Clients consistently asked for effective ways to monitor models after deployment, which pointed to a gap in the market. The pandemic sharpened this need: some freelancers lost clients and saw an opportunity to build something new. These circumstances formed the backdrop for founding NannyML, a company focused on post-deployment monitoring and maintaining model performance.
The Importance of Model Monitoring
Models in production can experience data drift or concept drift, both of which can significantly affect performance. Data drift (also called covariate shift) occurs when the distribution of the input data changes, while concept drift occurs when the relationship between the input data and the predicted outcome shifts over time. Either change can cause a model to underperform or make incorrect predictions, often silently, which can hurt businesses that rely on those predictions. Monitoring systems that detect and analyze these drifts let teams intervene before performance degrades unnoticed.
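To make the data drift idea concrete, here is a minimal sketch that compares one feature's distribution at training time with its distribution in production, using the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs). The function name, the sample data, and the `DRIFT_THRESHOLD` value are illustrative assumptions, not something taken from the episode or from NannyML. Note that a check like this only looks at inputs, so it cannot reveal concept drift on its own.

```python
from bisect import bisect_right


def ks_statistic(reference, production):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute
    difference between the empirical CDFs of the two samples."""
    ref = sorted(reference)
    prod = sorted(production)
    max_gap = 0.0
    # The largest gap is always attained at one of the observed values.
    for x in ref + prod:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_prod = bisect_right(prod, x) / len(prod)
        max_gap = max(max_gap, abs(cdf_ref - cdf_prod))
    return max_gap


# Hypothetical feature values: one reference sample and two production samples.
reference = list(range(100))               # values seen at training time
similar = [x + 0.5 for x in range(100)]    # nearly the same distribution
shifted = [x + 50 for x in range(100)]     # distribution moved to the right

DRIFT_THRESHOLD = 0.2  # assumed alerting threshold, to be tuned per feature

ks_similar = ks_statistic(reference, similar)
ks_shifted = ks_statistic(reference, shifted)

print("similar sample drifted:", ks_similar > DRIFT_THRESHOLD)
print("shifted sample drifted:", ks_shifted > DRIFT_THRESHOLD)
```

In practice the threshold would be chosen per feature, and a significance test or a multivariate method may be a better fit than a fixed cutoff.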
Navigating Challenges in Model Performance
Understanding the root causes of model performance issues is a crucial element of effective monitoring. Key metrics, such as data quality indicators and distribution shifts, play an essential role in diagnosing problems that hinder model success. For instance, businesses need to track their data quality metrics closely to identify anomalies, such as sudden increases in missing values, which may indicate a data pipeline problem. Additionally, organizations should focus on correlating specific features with performance metrics to better understand how different factors contribute to model behavior over time, ultimately leading to informed decisions on model updates or retraining.
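The missing-values example can be sketched as a simple per-chunk check. This is a minimal illustration, assuming production rows arrive as dictionaries grouped into time-based chunks; the feature name `income`, the baseline rate, and the tolerance are hypothetical values chosen for the example.

```python
def missing_rate(rows, feature):
    """Fraction of rows where the feature is absent or None."""
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def flag_missing_spikes(chunks, feature, baseline_rate, tolerance=0.05):
    """Return (chunk_index, rate) for every chunk whose missing rate
    exceeds the baseline by more than the tolerance."""
    alerts = []
    for index, rows in enumerate(chunks):
        rate = missing_rate(rows, feature)
        if rate > baseline_rate + tolerance:
            alerts.append((index, rate))
    return alerts


# Hypothetical data: two weekly chunks of 20 rows each.
week_0 = [{"income": 1000.0}] * 19 + [{"income": None}]       # 1 of 20 missing
week_1 = [{"income": 1000.0}] * 12 + [{"income": None}] * 8   # 8 of 20 missing

alerts = flag_missing_spikes(
    [week_0, week_1], feature="income", baseline_rate=0.05
)
for chunk_index, rate in alerts:
    print(f"chunk {chunk_index}: missing rate {rate:.0%} exceeds baseline")
```

A spike like the one in the second chunk would typically point to an upstream pipeline problem rather than a change in the real world, which is why it is worth checking before retraining.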
Our guest today is Wojtek Kuberski, Co-Founder and CTO at NannyML.
In our conversation, we first discuss Wojtek's experience working as a freelancer. We then talk about NannyML, the platform for post-deployment data science. We dive deep into model monitoring and discuss the key causes of model failure, including covariate shift, concept drift and bad data quality. Wojtek also explains how NannyML's algorithms can estimate model performance without access to labels.
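As a rough intuition for how performance can be estimated without labels, one general idea is to use the model's own confidence. The sketch below assumes a binary classifier whose predicted probabilities are well calibrated: if it predicts the positive class with probability 0.9, that prediction is expected to be correct 90% of the time. It is a simplified illustration of that idea, not NannyML's implementation, and the function name and sample scores are made up for the example.

```python
def estimate_accuracy(probabilities, threshold=0.5):
    """Estimate accuracy from predicted positive-class probabilities alone.

    Assumes the probabilities are calibrated. A row scored p is predicted
    positive when p >= threshold and is then correct with probability p;
    otherwise it is predicted negative and correct with probability 1 - p.
    """
    expected_correct = [
        p if p >= threshold else 1.0 - p for p in probabilities
    ]
    return sum(expected_correct) / len(expected_correct)


# Hypothetical production scores for which no labels are available yet.
production_scores = [0.9, 0.8, 0.2, 0.6, 0.1]
estimated = estimate_accuracy(production_scores)
print(f"estimated accuracy: {estimated:.2f}")
```

The estimate is only as good as the calibration assumption, and it cannot account for concept drift, where the relationship between inputs and outcomes has itself changed.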
If you enjoyed the episode, please leave a 5-star review and subscribe to the AI Stories YouTube channel.