How to Evaluate a Model's Behavior

If you start with generating an explanation, say my model outputs a sentence that says why it predicted something, and that sentence is supposed to match what a human would say, that actually doesn't constrain the model at all. And so if you are interested in explaining the model's behavior, then you have to start thinking of evaluations that focus purely on that. So there are a bunch of evaluations that people have done where you don't even think about the user in the loop or anything like that. What you try to do is figure out, using some other techniques, things that are definitely not important for the model or things that were definitely used by the model.
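
One way to picture this kind of evaluation is the minimal sketch below. It is an illustration, not something described in the episode: the toy linear model, its hand-set weights, the occlusion-style attribution, and the `ignored_feature_mass` score are all assumptions made for the example. Because the weights are fixed by hand, we know for certain which features the model cannot use, and we can check whether an explanation assigns them any importance.

```python
import random

# Hypothetical toy model: a linear scorer with hand-set weights, so we know
# exactly which features it ignores (those with weight 0.0).
WEIGHTS = [2.0, 0.0, -1.5, 0.0]
IGNORED = {1, 3}  # indices of features the model provably never uses


def model(x):
    return sum(w * v for w, v in zip(WEIGHTS, x))


def occlusion_attribution(f, x, baseline=0.0):
    """Score each feature by how much the output moves when it is replaced by a baseline."""
    full = f(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline
        scores.append(abs(full - f(occluded)))
    return scores


def ignored_feature_mass(scores, ignored):
    """Fraction of total attribution placed on features the model cannot use."""
    total = sum(scores)
    if total == 0:
        return 0.0
    return sum(scores[i] for i in ignored) / total


rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in WEIGHTS]

# An explanation computed from the model itself puts no mass on ignored features.
faithful_leak = ignored_feature_mass(occlusion_attribution(model, x), IGNORED)

# A made-up explanation that spreads importance evenly is flagged by the same score.
uniform_leak = ignored_feature_mass([1.0, 1.0, 1.0, 1.0], IGNORED)

print(faithful_leak, uniform_leak)
```

Note that no human judgment appears anywhere in the score: an explanation is penalized only for crediting features the model is known not to use, which is the sense in which the evaluation focuses purely on the model.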
