Adapt and Detect: Embrace New Data Challenges

When developing machine learning models, it is crucial to anticipate that incoming data will differ from the training data, whether through changed contexts (such as dogs on different surfaces), the emergence of new categories (such as parakeets appearing alongside cats and dogs), or entirely new data regimes (such as infections from a novel virus). Robust models should adapt to these shifts and keep failure rates low on unfamiliar data. Effective mechanisms for detecting new classes or intents are equally essential, especially in critical applications such as self-driving cars and healthcare, and wherever new symptoms or customer queries may arise. Detection lets a system fall back to cautious protocols or trigger human intervention, so that it keeps operating safely as the data evolves.

A major challenge in applied AI is out-of-distribution (OOD) detection: the task of detecting instances that do not belong to the distribution the classifier was trained on. OOD data is often referred to as “unseen” data, because the model did not encounter it during training.
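To make the idea concrete, here is a minimal sketch of one common baseline for OOD detection: thresholding the classifier's maximum softmax probability and deferring to a human when confidence is low. This is an illustration, not a method described in the episode. The function names, the `DEFER_TO_HUMAN` marker, the example logits, and the threshold of 0.8 are all assumptions chosen for the example.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw classifier scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_defer(
    logits: List[float],
    labels: List[str],
    threshold: float = 0.8,  # assumed value; tune on held-out data
) -> Tuple[str, float]:
    """Return the predicted label, or a deferral marker if confidence is low.

    Low maximum softmax probability is treated as a hint that the input
    may be out of distribution, so the case is routed to a human.
    """
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        return "DEFER_TO_HUMAN", confidence
    return labels[probs.index(confidence)], confidence


labels = ["cat", "dog"]
print(classify_or_defer([4.0, 0.5], labels))  # confident prediction
print(classify_or_defer([1.1, 1.0], labels))  # ambiguous, so deferred
```

Softmax confidence is only a rough signal, since a model can be highly confident on inputs unlike anything it has seen. In practice the threshold would need to be calibrated on held-out data, and stronger detectors may be preferable.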

Bayan Bruss is the VP of AI Foundations at Capital One, where he works with academic researchers to apply the latest research to fundamental problems in financial services. Bayan joins the show with Sean Falconer to talk about OOD, the importance of bringing AI research to real-world applications, and more.

Full Disclosure: This episode is sponsored by Capital One.

Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.