Understanding and Evaluating Toxicity in Language Models | 1min snip from The Real Python Podcast

Get the app

Measuring Bias, Toxicity, and Truthfulness in LLMs With Python

The Real Python Podcast

chevron_right

notes

NOTE

Understanding and Evaluating Toxicity in Language Models

Toxicity in language models refers to the tendency to generate content that is hateful towards certain groups. It is crucial to clearly define toxicity to comprehend the significance of the evaluation. The assessment involves completing bias prompts, altering pronouns, and utilizing a toxicity metric to evaluate the completed prompt through a hate speech classifier, providing a raw probability where zero indicates non-hate speech.

00:00

Transcript

chevron_right

Play full episode

chevron_right

Transcript

Episode notes

How can you measure the quality of a large language model? What tools can measure bias, toxicity, and truthfulness levels in a model using Python? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, returns to discuss techniques and tools for evaluating LLMs With Python.

Jodie provides some background on large language models and how they can absorb vast amounts of information about the relationship between words using a type of neural network called a transformer. We discuss training datasets and the potential quality issues with crawling uncurated sources.

We dig into ways to measure levels of bias, toxicity, and hallucinations using Python. Jodie shares three benchmarking datasets and links to resources to get you started. We also discuss ways to augment models using agents or plugins, which can access search engine results or other authoritative sources.

This week’s episode is brought to you by Intel.

Course Spotlight: Learn Text Classification With Python and Keras

In this course, you’ll learn about Python text classification with Keras, working your way from a bag-of-words model with logistic regression to more advanced methods, such as convolutional neural networks. You’ll see how you can use pretrained word embeddings, and you’ll squeeze more performance out of your model through hyperparameter optimization.

Topics: