

Measuring Bias, Toxicity, and Truthfulness in LLMs With Python
Jan 19, 2024
Jodie Burchell, developer advocate for data science at JetBrains, discusses techniques and tools for evaluating large language models (LLMs) using Python. She covers measuring bias, toxicity, and truthfulness in LLMs, the challenges and limitations of current AI language models, and the role of Python packages and the Hugging Face ecosystem, along with a digression on grouping and acronyms. Jodie also shares benchmarking datasets and resources available on Hugging Face for evaluating LLMs.
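For listeners who want to try this kind of evaluation themselves, here is a minimal sketch along the lines discussed in the episode, using the Hugging Face `evaluate` and `datasets` libraries. The specific measurements and example texts are illustrative assumptions, not code from the episode.

```python
# Sketch: scoring model outputs for toxicity and loading a truthfulness
# benchmark with Hugging Face tooling (pip install evaluate datasets).
import evaluate
from datasets import load_dataset

# The "toxicity" measurement scores each text with a hate-speech classifier;
# related measurements such as "regard" (bias) and "honest" also live in
# the evaluate library's measurement catalog.
toxicity = evaluate.load("toxicity", module_type="measurement")

# Placeholder model outputs; in practice these would come from the LLM under test.
outputs = [
    "The scientist presented her results clearly.",
    "People from that town are all dishonest.",
]

scores = toxicity.compute(predictions=outputs)
print(scores["toxicity"])  # one score per text, roughly 0 (benign) to 1 (toxic)

# Benchmarking datasets for truthfulness, such as TruthfulQA, can be pulled
# straight from the Hugging Face Hub.
truthful_qa = load_dataset("truthful_qa", "generation")
print(truthful_qa["validation"][0]["question"])
```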
Chapters
Introduction
00:00 • 2min
Testing Bias, Toxicity, and Truthfulness in Large Language Models
01:52 • 13min
Measuring Bias, Toxicity, and Truthfulness in LLMs
14:39 • 2min
Analyzing Safeguards and Limitations of AI Language Models
16:14 • 14min
Building AI Apps and Using Python Packages
30:31 • 2min
Hugging Face: Making Generative AI Accessible
32:08 • 18min
Discussing Grouping and Acronyms
50:33 • 2min
Measuring Bias, Toxicity, and Truthfulness in LLMs
52:38 • 21min
Wrapping Up and Future Plans
01:13:45 • 2min