Reinforcement Learning from Human Feedback | 2min snip from The Real Python Podcast

Measuring Bias, Toxicity, and Truthfulness in LLMs With Python

The Real Python Podcast

Reinforcement Learning from Human Feedback

A large language model was fine-tuned using repeated prompts, with people manually rating each answer for quality, bias, toxicity, and potential for harm. A second model was then trained to predict these ratings, and the two models were placed in a feedback loop: the fine-tuned model produces an answer, the rating model scores it, and the fine-tuned model is adjusted slightly toward answers that score well. This process is called reinforcement learning from human feedback (RLHF), and over time it brings the model's outputs closer to the responses people rated as desirable.
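
The feedback loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method used for any real model: the "language model" is just a probability distribution over three canned responses, the "reward model" simply averages made-up human ratings, and the update rule is a basic REINFORCE policy-gradient step. All names (`RESPONSES`, `HUMAN_RATINGS`, `train_reward_model`, `rlhf_loop`) and all rating values are hypothetical.

```python
import math
import random

# Hypothetical canned responses standing in for LLM outputs.
RESPONSES = ["helpful answer", "rude answer", "made-up answer"]

# Hypothetical human ratings (0 = bad, 1 = good), several per response.
HUMAN_RATINGS = {
    "helpful answer": [0.9, 1.0, 0.8],
    "rude answer": [0.1, 0.2, 0.0],
    "made-up answer": [0.3, 0.2, 0.4],
}


def train_reward_model(ratings):
    """Toy reward model: predict the mean human rating for each response."""
    return {resp: sum(vals) / len(vals) for resp, vals in ratings.items()}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rlhf_loop(reward_model, steps=2000, lr=0.1, seed=0):
    """Nudge the policy toward responses the reward model scores highly."""
    rng = random.Random(seed)
    logits = [0.0] * len(RESPONSES)  # the "policy": one logit per response
    for _ in range(steps):
        probs = softmax(logits)
        # Expected reward under the current policy serves as a baseline.
        baseline = sum(p * reward_model[r] for p, r in zip(probs, RESPONSES))
        # The policy "answers" by sampling one response.
        choice = rng.choices(range(len(RESPONSES)), weights=probs)[0]
        # The reward model scores the answer in place of a human rater.
        advantage = reward_model[RESPONSES[choice]] - baseline
        # REINFORCE step: small adjustment along the gradient of the
        # log-probability of the sampled response, scaled by the advantage.
        for i in range(len(logits)):
            grad = (1.0 if i == choice else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
    return softmax(logits)


reward_model = train_reward_model(HUMAN_RATINGS)
final_probs = rlhf_loop(reward_model)
```

Printing `dict(zip(RESPONSES, final_probs))` after the loop shows how the probability mass has shifted toward the highly rated response. In a real system both models are neural networks and the update is considerably more involved, but the shape of the loop is the same: generate, score, adjust slightly, repeat.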

Episode notes

How can you measure the quality of a large language model? What tools can measure bias, toxicity, and truthfulness levels in a model using Python? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, returns to discuss techniques and tools for evaluating LLMs with Python.

Jodie provides some background on large language models and how they can absorb vast amounts of information about the relationship between words using a type of neural network called a transformer. We discuss training datasets and the potential quality issues with crawling uncurated sources.

We dig into ways to measure levels of bias, toxicity, and hallucinations using Python. Jodie shares three benchmarking datasets and links to resources to get you started. We also discuss ways to augment models using agents or plugins, which can access search engine results or other authoritative sources.

This week’s episode is brought to you by Intel.

Course Spotlight: Learn Text Classification With Python and Keras

In this course, you’ll learn about Python text classification with Keras, working your way from a bag-of-words model with logistic regression to more advanced methods, such as convolutional neural networks. You’ll see how you can use pretrained word embeddings, and you’ll squeeze more performance out of your model through hyperparameter optimization.

Topics: