Detecting Bias in a Large Language Model?
When we started out on our project, we probably took a couple of weeks to think about and clearly define what bias actually is. But I think after doing the work we have done, I'm probably even less confident in my personal definition of it. However, when you look into the literature, there have been different attempts to quantify it at the model level. You can think about, for example, if a model like GPT-3 produces text, what are the different words that are maybe associated with gender stereotypes in this text? That's one way we might go about it, but I will probably stop there.
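As a rough illustration of the word-association idea the speaker mentions, here is a minimal Python sketch that counts stereotype-associated words in text sampled from a model. The word lists, example completions, and function name are illustrative assumptions for this sketch, not part of the speaker's actual method or any established benchmark.

```python
# Sketch: count stereotype-associated words in model-generated text.
# The word lists and sample completions below are illustrative only.
from collections import Counter
import re

FEMALE_ASSOCIATED = {"she", "her", "nurse", "emotional", "family"}
MALE_ASSOCIATED = {"he", "his", "engineer", "rational", "career"}

def association_counts(generated_texts):
    """Tally stereotype-associated words across a set of completions."""
    counts = Counter()
    for text in generated_texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        for tok in tokens:
            if tok in FEMALE_ASSOCIATED:
                counts["female_associated"] += 1
            elif tok in MALE_ASSOCIATED:
                counts["male_associated"] += 1
    return counts

if __name__ == "__main__":
    # In practice these completions would be sampled from a model
    # such as GPT-3, e.g. from prompts like "The doctor said that ...".
    samples = [
        "The doctor said that he would review the results himself.",
        "The doctor said that she needed to get back to her family.",
    ]
    print(association_counts(samples))
```

Comparing the resulting counts across prompt variants gives a crude signal of the kind of gender-stereotype association the speaker describes, though real studies use far more careful word lists and statistics.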