AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Machine Learning Systems From a Datea Perspective
In general, really focusing on the discipline of model valuation is something that i think is incredibly important. There are some good ways to use the data that we already have to help probe and there are two that i quite like. We can first cluster our data before doing any training, and then do the moral equivalent of cross validation. But instead of holding hold out, we hold one of these clusters out. And so we're creating a world in which our models explicitly have some blind spots,. That blind spot corresponds to a cluster tha we then use as testato. If it's very far away from its nearest neighbor in the training data, then that's probably a pretty tricky example to get right